Improve normalizer performance by adjusting the trie value format #5813

Open
wants to merge 17 commits into
base: main

hsivonen
Member

With the fast trie type, I see this kind of performance improvement:

el_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [3.0115 µs 3.0127 µs 3.0141 µs]
                        thrpt:  [679.47 Melem/s 679.78 Melem/s 680.06 Melem/s]
                 change:
                        time:   [-35.114% -35.083% -35.049%] (p = 0.00 < 0.05)
                        thrpt:  [+53.963% +54.042% +54.117%]
                        Performance has improved.

el_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [4.4824 µs 4.4837 µs 4.4851 µs]
                        thrpt:  [456.62 Melem/s 456.77 Melem/s 456.90 Melem/s]
                 change:
                        time:   [-30.365% -30.238% -30.102%] (p = 0.00 < 0.05)
                        thrpt:  [+43.065% +43.344% +43.605%]
                        Performance has improved.

el_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [4.4836 µs 4.4848 µs 4.4859 µs]
                        thrpt:  [456.54 Melem/s 456.66 Melem/s 456.78 Melem/s]
                 change:
                        time:   [-31.927% -31.836% -31.751%] (p = 0.00 < 0.05)
                        thrpt:  [+46.522% +46.705% +46.901%]
                        Performance has improved.

el_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [11.465 µs 11.491 µs 11.514 µs]
                        thrpt:  [194.89 Melem/s 195.29 Melem/s 195.72 Melem/s]
                 change:
                        time:   [-14.115% -14.021% -13.925%] (p = 0.00 < 0.05)
                        thrpt:  [+16.177% +16.307% +16.435%]
                        Performance has improved.

en_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [990.20 ns 990.50 ns 990.80 ns]
                        thrpt:  [2.0670 Gelem/s 2.0676 Gelem/s 2.0683 Gelem/s]
                 change:
                        time:   [-2.0851% -1.9873% -1.8175%] (p = 0.00 < 0.05)
                        thrpt:  [+1.8512% +2.0275% +2.1295%]
                        Performance has improved.

en_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [704.59 ns 705.45 ns 706.47 ns]
                        thrpt:  [2.8989 Gelem/s 2.9031 Gelem/s 2.9066 Gelem/s]
                 change:
                        time:   [-30.362% -30.311% -30.265%] (p = 0.00 < 0.05)
                        thrpt:  [+43.401% +43.494% +43.599%]
                        Performance has improved.

en_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [704.27 ns 704.57 ns 705.05 ns]
                        thrpt:  [2.9048 Gelem/s 2.9067 Gelem/s 2.9080 Gelem/s]
                 change:
                        time:   [-30.268% -30.188% -30.087%] (p = 0.00 < 0.05)
                        thrpt:  [+43.035% +43.242% +43.406%]
                        Performance has improved.

en_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [991.99 ns 992.27 ns 992.55 ns]
                        thrpt:  [2.0634 Gelem/s 2.0640 Gelem/s 2.0645 Gelem/s]
                 change:
                        time:   [-2.0092% -1.9614% -1.9088%] (p = 0.00 < 0.05)
                        thrpt:  [+1.9460% +2.0006% +2.0504%]
                        Performance has improved.

fr_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [984.23 ns 984.47 ns 984.72 ns]
                        thrpt:  [2.0798 Gelem/s 2.0803 Gelem/s 2.0808 Gelem/s]
                 change:
                        time:   [-2.0970% -1.9296% -1.8348%] (p = 0.00 < 0.05)
                        thrpt:  [+1.8691% +1.9675% +2.1419%]
                        Performance has improved.

fr_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [1.5205 µs 1.5215 µs 1.5223 µs]
                        thrpt:  [1.3453 Gelem/s 1.3460 Gelem/s 1.3469 Gelem/s]
                 change:
                        time:   [-24.128% -23.976% -23.831%] (p = 0.00 < 0.05)
                        thrpt:  [+31.286% +31.538% +31.801%]
                        Performance has improved.

fr_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [1.5139 µs 1.5159 µs 1.5177 µs]
                        thrpt:  [1.3494 Gelem/s 1.3510 Gelem/s 1.3528 Gelem/s]
                 change:
                        time:   [-22.127% -22.036% -21.950%] (p = 0.00 < 0.05)
                        thrpt:  [+28.123% +28.265% +28.414%]
                        Performance has improved.

fr_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [3.5719 µs 3.5739 µs 3.5760 µs]
                        thrpt:  [588.65 Melem/s 588.99 Melem/s 589.33 Melem/s]
                 change:
                        time:   [-4.9182% -4.8589% -4.7968%] (p = 0.00 < 0.05)
                        thrpt:  [+5.0385% +5.1070% +5.1726%]
                        Performance has improved.

ja_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [3.3380 µs 3.3388 µs 3.3394 µs]
                        thrpt:  [613.28 Melem/s 613.40 Melem/s 613.53 Melem/s]
                 change:
                        time:   [-42.084% -42.055% -42.027%] (p = 0.00 < 0.05)
                        thrpt:  [+72.495% +72.578% +72.664%]
                        Performance has improved.

ja_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [4.6673 µs 4.6767 µs 4.6874 µs]
                        thrpt:  [436.91 Melem/s 437.92 Melem/s 438.79 Melem/s]
                 change:
                        time:   [-28.384% -28.291% -28.174%] (p = 0.00 < 0.05)
                        thrpt:  [+39.226% +39.453% +39.633%]
                        Performance has improved.

ja_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [4.8018 µs 4.8065 µs 4.8115 µs]
                        thrpt:  [425.65 Melem/s 426.09 Melem/s 426.50 Melem/s]
                 change:
                        time:   [-27.291% -27.215% -27.147%] (p = 0.00 < 0.05)
                        thrpt:  [+37.262% +37.391% +37.534%]
                        Performance has improved.

ja_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [8.8205 µs 8.8228 µs 8.8250 µs]
                        thrpt:  [246.01 Melem/s 246.07 Melem/s 246.13 Melem/s]
                 change:
                        time:   [-14.915% -14.811% -14.716%] (p = 0.00 < 0.05)
                        thrpt:  [+17.255% +17.386% +17.530%]
                        Performance has improved.

kn_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [8.1968 µs 8.2000 µs 8.2032 µs]
                        thrpt:  [249.66 Melem/s 249.76 Melem/s 249.85 Melem/s]
                 change:
                        time:   [-12.150% -12.094% -12.035%] (p = 0.00 < 0.05)
                        thrpt:  [+13.681% +13.758% +13.831%]
                        Performance has improved.

kn_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [4.4757 µs 4.4765 µs 4.4774 µs]
                        thrpt:  [457.41 Melem/s 457.50 Melem/s 457.59 Melem/s]
                 change:
                        time:   [-26.836% -26.756% -26.660%] (p = 0.00 < 0.05)
                        thrpt:  [+36.352% +36.529% +36.679%]
                        Performance has improved.

kn_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [3.7885 µs 3.7893 µs 3.7901 µs]
                        thrpt:  [540.35 Melem/s 540.47 Melem/s 540.59 Melem/s]
                 change:
                        time:   [-31.691% -31.619% -31.551%] (p = 0.00 < 0.05)
                        thrpt:  [+46.094% +46.239% +46.394%]
                        Performance has improved.

kn_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [10.406 µs 10.411 µs 10.417 µs]
                        thrpt:  [202.08 Melem/s 202.18 Melem/s 202.29 Melem/s]
                 change:
                        time:   [-9.8301% -9.7583% -9.6878%] (p = 0.00 < 0.05)
                        thrpt:  [+10.727% +10.814% +10.902%]
                        Performance has improved.

ko_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [2.7431 µs 2.7435 µs 2.7440 µs]
                        thrpt:  [746.36 Melem/s 746.48 Melem/s 746.60 Melem/s]
                 change:
                        time:   [-33.624% -33.575% -33.534%] (p = 0.00 < 0.05)
                        thrpt:  [+50.454% +50.547% +50.658%]
                        Performance has improved.

ko_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [18.572 µs 18.579 µs 18.587 µs]
                        thrpt:  [110.19 Melem/s 110.23 Melem/s 110.27 Melem/s]
                 change:
                        time:   [-5.1016% -4.9844% -4.8827%] (p = 0.00 < 0.05)
                        thrpt:  [+5.1334% +5.2459% +5.3758%]
                        Performance has improved.

ko_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [6.5094 µs 6.5145 µs 6.5199 µs]
                        thrpt:  [314.12 Melem/s 314.37 Melem/s 314.62 Melem/s]
                 change:
                        time:   [-38.693% -38.636% -38.575%] (p = 0.00 < 0.05)
                        thrpt:  [+62.801% +62.961% +63.113%]
                        Performance has improved.

ko_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [39.109 µs 39.145 µs 39.181 µs]
                        thrpt:  [102.86 Melem/s 102.95 Melem/s 103.05 Melem/s]
                 change:
                        time:   [-3.7017% -3.6061% -3.5037%] (p = 0.00 < 0.05)
                        thrpt:  [+3.6309% +3.7410% +3.8440%]
                        Performance has improved.

vi_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [1.3298 µs 1.3313 µs 1.3331 µs]
                        thrpt:  [1.5363 Gelem/s 1.5384 Gelem/s 1.5401 Gelem/s]
                 change:
                        time:   [-14.921% -14.827% -14.696%] (p = 0.00 < 0.05)
                        thrpt:  [+17.228% +17.408% +17.538%]
                        Performance has improved.

vi_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [7.3388 µs 7.3408 µs 7.3428 µs]
                        thrpt:  [278.91 Melem/s 278.99 Melem/s 279.06 Melem/s]
                 change:
                        time:   [-10.183% -10.060% -9.9402%] (p = 0.00 < 0.05)
                        thrpt:  [+11.037% +11.185% +11.337%]
                        Performance has improved.

vi_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [6.7638 µs 6.7926 µs 6.8147 µs]
                        thrpt:  [300.53 Melem/s 301.51 Melem/s 302.79 Melem/s]
                 change:
                        time:   [-5.9909% -5.4787% -4.8880%] (p = 0.00 < 0.05)
                        thrpt:  [+5.1392% +5.7963% +6.3727%]
                        Performance has improved.

vi_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [21.887 µs 21.897 µs 21.908 µs]
                        thrpt:  [119.09 Melem/s 119.15 Melem/s 119.21 Melem/s]
                 change:
                        time:   [-6.6439% -6.5689% -6.4944%] (p = 0.00 < 0.05)
                        thrpt:  [+6.9454% +7.0308% +7.1167%]
                        Performance has improved.

vi_orthographic_to_nfc_utf16/icu4x                                                                             
                        time:   [19.753 µs 19.766 µs 19.780 µs]
                        thrpt:  [120.63 Melem/s 120.71 Melem/s 120.79 Melem/s]
                 change:
                        time:   [-2.8159% -2.7400% -2.6556%] (p = 0.00 < 0.05)
                        thrpt:  [+2.7280% +2.8172% +2.8974%]
                        Performance has improved.

vi_orthographic_to_nfd_utf16/icu4x                                                                             
                        time:   [7.0146 µs 7.0182 µs 7.0223 µs]
                        thrpt:  [339.78 Melem/s 339.97 Melem/s 340.15 Melem/s]
                 change:
                        time:   [-12.492% -12.445% -12.397%] (p = 0.00 < 0.05)
                        thrpt:  [+14.151% +14.214% +14.275%]
                        Performance has improved.

zh_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [3.2568 µs 3.2577 µs 3.2586 µs]
                        thrpt:  [628.49 Melem/s 628.67 Melem/s 628.83 Melem/s]
                 change:
                        time:   [-35.288% -35.198% -35.146%] (p = 0.00 < 0.05)
                        thrpt:  [+54.194% +54.317% +54.530%]
                        Performance has improved.

zh_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [2.8441 µs 2.8452 µs 2.8464 µs]
                        thrpt:  [719.50 Melem/s 719.80 Melem/s 720.09 Melem/s]
                 change:
                        time:   [-38.993% -38.911% -38.836%] (p = 0.00 < 0.05)
                        thrpt:  [+63.495% +63.696% +63.914%]
                        Performance has improved.

zh_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [2.8525 µs 2.8540 µs 2.8555 µs]
                        thrpt:  [717.21 Melem/s 717.59 Melem/s 717.97 Melem/s]
                 change:
                        time:   [-39.014% -38.907% -38.811%] (p = 0.00 < 0.05)
                        thrpt:  [+63.429% +63.685% +63.971%]
                        Performance has improved.

zh_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [3.2835 µs 3.2847 µs 3.2860 µs]
                        thrpt:  [623.56 Melem/s 623.80 Melem/s 624.02 Melem/s]
                 change:
                        time:   [-34.955% -34.935% -34.913%] (p = 0.00 < 0.05)
                        thrpt:  [+53.641% +53.693% +53.739%]
                        Performance has improved.
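
The `time` and `thrpt` change rows above describe the same measurement from two sides: with a fixed number of elements per iteration, throughput is inversely proportional to time. A minimal sketch of that conversion follows; the function name `throughput_change` is hypothetical, and the result only approximates Criterion's figures, since Criterion derives both rows from its own sample distributions.

```rust
/// Converts a relative change in time (for example, -0.35083 for -35.083%)
/// into the corresponding relative change in throughput.
/// Assumes the amount of work per iteration is unchanged between runs.
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}

fn main() {
    // el_nfc_to_nfc_utf16: reported median time change of -35.083%.
    let el = throughput_change(-0.35083);
    // en_nfc_to_nfc_utf16: reported median time change of -1.9873%.
    let en = throughput_change(-0.019873);
    println!("el: {:+.2}%, en: {:+.2}%", el * 100.0, en * 100.0);
}
```

For the two cases shown, the results land close to the reported median throughput changes of +54.042% and +2.0275%.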

@hsivonen hsivonen added A-performance Area: Performance (CPU, Memory) C-collator Component: Collation, normalization 2.0-breaking Changes that are breaking API changes labels Nov 13, 2024
@hsivonen
Member Author

ICU4C PR: unicode-org/icu#3269

Manishearth
Manishearth previously approved these changes Nov 13, 2024
Member

@Manishearth Manishearth left a comment

Landable for the purpose of 2.0, but I think this could have a couple more pointers in the docs and be more encapsulated.

components/normalizer/trie-value-format.md
/// Getting a zero from this trie means that you need
/// to make another lookup from `DecompositionDataV1::trie`.
pub struct DecompositionDataV2<'data> {
    /// Trie for decomposition.
    #[cfg_attr(feature = "serde", serde(borrow))]
    pub trie: CodePointTrie<'data, u32>,
Member

issue: I feel like the packed code logic is all scattered. Can we use a structured NormalizationTrieValue(pub u32) type that has convenience methods for getting all the fields?
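
To illustrate the kind of encapsulation suggested here, a minimal sketch of such a newtype follows. Only the name `NormalizationTrieValue(pub u32)` comes from the comment above. The bit layout, constants, and method names are invented for illustration and are not the format described in `components/normalizer/trie-value-format.md`.

```rust
/// Hypothetical wrapper around a packed `u32` trie value.
/// ASSUMED layout, for illustration only (not the PR's actual format):
///   bits 0..=7   canonical combining class
///   bit  8       a boolean flag
///   bits 16..=31 a 16-bit payload
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NormalizationTrieValue(pub u32);

impl NormalizationTrieValue {
    const CCC_MASK: u32 = 0xFF;
    const FLAG_BIT: u32 = 1 << 8;
    const PAYLOAD_SHIFT: u32 = 16;

    /// True if the raw value is zero.
    pub const fn is_zero(self) -> bool {
        self.0 == 0
    }

    /// Extracts the (assumed) combining class field.
    pub const fn ccc(self) -> u8 {
        (self.0 & Self::CCC_MASK) as u8
    }

    /// Extracts the (assumed) flag bit.
    pub const fn flag(self) -> bool {
        self.0 & Self::FLAG_BIT != 0
    }

    /// Extracts the (assumed) 16-bit payload.
    pub const fn payload(self) -> u16 {
        (self.0 >> Self::PAYLOAD_SHIFT) as u16
    }
}

fn main() {
    let v = NormalizationTrieValue(0x0041_01E6);
    println!("ccc={} flag={} payload={:#06x}", v.ccc(), v.flag(), v.payload());
}
```

With accessors like these, call sites would read fields by name instead of repeating masks and shifts, so a later change to the packed format would touch one `impl` block.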

Member Author

I agree that what you suggest would be better for encapsulation. However, given that prior to this PR there was no such encapsulation and I'm already way over my time budget for this, I would very much prefer landing this ASAP (before 2.0 and before this bitrots) without such a refactoring and leaving the refactoring as a follow-up.

components/normalizer/src/provider.rs
@hsivonen hsivonen added the discuss-priority Discuss at the next ICU4X meeting label Nov 14, 2024
@hsivonen
Member Author

I tested this with the normalization test suite. I also tested that UTS 46 still works: https://github.com/hsivonen/rust-url/tree/unicode16 https://github.com/hsivonen/idna_adapter/tree/icu4x-trunk

@hsivonen
Member Author

CI showing ffi/harfbuzz/src/lib.rs in a state that doesn't match what the PR diff viewer shows confuses me.

@sffc sffc added this to the ICU4X 2.0 ⟨P1⟩ milestone Nov 23, 2024
Labels
2.0-breaking Changes that are breaking API changes A-performance Area: Performance (CPU, Memory) C-collator Component: Collation, normalization discuss-priority Discuss at the next ICU4X meeting
3 participants