From: Clark Cox (clarkcox3@mac.com)
Date: Sat Jan 10 2004 - 11:22:48 EST
I'm in the process of writing several normalization routines and
testing them against NormalizationTest.txt. The code that I use to do
the composition for NFC and NFKC seems to work for every line in the
test file except for 21 of them. An example of where my routine
fails is the line:
1026;1026;1025 102E;1026;1025 102E; # (ဦ; ဦ; ဥ◌ီ; ဦ; ဥ◌ီ; ) MYANMAR LETTER UU
According to the comment at the beginning of the file, and all that
I've read elsewhere, toNFC(U+1025 U+102E) should result in U+1026.
However, both U+1025 and U+102E have a combining class of zero, so my
code does not compose those characters. Nothing that I've been able to
find explains this discrepancy. Any help would be greatly appreciated.
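The discrepancy can be reproduced independently of my routines. The
following is only a minimal sketch, assuming Python's standard
unicodedata module (it is not the code under test, and the variable
names are just for illustration). It checks the combining classes of
the two characters, the canonical decomposition of U+1026, and what a
stock NFC implementation does with the pair:

```python
import unicodedata

# The pair from the failing test line (names chosen for illustration only).
base, mark = "\u1025", "\u102E"  # MYANMAR LETTER U, MYANMAR VOWEL SIGN II

# Canonical combining class of each character.
classes = [unicodedata.combining(c) for c in (base, mark)]

# Canonical decomposition mapping of U+1026 MYANMAR LETTER UU.
decomp = unicodedata.decomposition("\u1026")

# Result of a stock NFC implementation on the two-character sequence.
composed = unicodedata.normalize("NFC", base + mark)

print("combining classes:", classes)
print("decomposition of U+1026:", decomp)
print("NFC result:", " ".join(f"U+{ord(c):04X}" for c in composed))
```

If this agrees with the test file, the pair composes even though both
combining classes are zero, which is exactly the case my composition
step skips.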
-- Clark S. Cox III clarkcox3@mac.com http://homepage.mac.com/clarkcox3/ http://homepage.mac.com/clarkcox3/blog/B1196589870/index.html