ICU4X supports Han text in UTF-8 and UTF-16. The following Han orderings are available:
Note that big5han and gb2312han need to be explicitly enabled in datagen. When used, they collate text in the order that would result from comparing it as encoded in Big5 or GB2312, respectively. However, they are not recommended for modern use; one of the other orderings, such as "pinyin" or "stroke", should produce better-quality collation results.
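To illustrate the encoding side of this, here is a minimal sketch using only the Rust standard library. It does not call ICU4X; it only shows that GB18030 bytes are not valid UTF-8, so they cannot be viewed as a `str` and would have to be transcoded to UTF-8 or UTF-16 before collation. The names `ZHONGWEN_GB18030`, `as_utf8`, and `to_utf16` are made up for this example.

```rust
/// GB18030 (identical to GBK/GB2312 for these characters) encoding of "中文".
/// Hypothetical sample data for this sketch.
pub const ZHONGWEN_GB18030: [u8; 4] = [0xD6, 0xD0, 0xCE, 0xC4];

/// Returns the bytes as a `&str` only if they are already valid UTF-8.
/// GB18030 input is rejected here; it is never reinterpreted.
pub fn as_utf8(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Re-encodes a UTF-8 string as UTF-16 code units, the other
/// input form mentioned above.
pub fn to_utf16(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}
```

Under these assumptions, `as_utf8(&ZHONGWEN_GB18030)` returns `None`, while the same text as UTF-8 round-trips unchanged. The actual transcoding from GB18030 is outside the standard library; a separate crate (for example `encoding_rs`, which I am assuming here, not something this thread confirms) would be needed to decode the bytes into a `String` first.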
Since Rust strings are always UTF-8 and the collator API takes str slices as parameters, it seems it is not feasible to pass GB18030-encoded bytes directly?