Hi Folks,
Thank you for your excellent responses.
Based on your responses, I now wonder why the W3C recommends that NFC be used for text exchanges over the Internet. Aside from the size advantage of NFC, there seem to be tremendous advantages to using NFD:
- It’s easier to do searches and other text processing on NFD-encoded text.
- NFD makes the regular expressions used to qualify its contents much, *much* simpler.
- Things like fuzzy text matching are probably easier in NFD.
- It’s easier to remember a handful of useful combining accents than the much larger number of precomposed forms.
- It is easier to use a few keystrokes for combining accents than to set up compose key sequences for all the possible composed characters.
- Some Unicode-defined processes, such as capitalization, are not guaranteed to preserve normalization forms.
- Some operating systems store filenames in NFD (NFD is a normalization form rather than an encoding).
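To make the size and text-processing points concrete, here is a minimal Python sketch using the standard unicodedata module. The sample word and the helper name strip_marks are illustrative assumptions only, and dropping combining marks is just one rough way to approximate accent-insensitive matching.

```python
import unicodedata

# "café" written with the precomposed character U+00E9.
word = "caf\u00e9"

nfc = unicodedata.normalize("NFC", word)
nfd = unicodedata.normalize("NFD", word)

# NFD splits U+00E9 into "e" + U+0301 COMBINING ACUTE ACCENT,
# so the NFD form has one more code point and one more UTF-8 byte.
print(len(nfc), len(nfd))                                   # 4 5
print(len(nfc.encode("utf-8")), len(nfd.encode("utf-8")))   # 5 6


def strip_marks(text):
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Once the text is decomposed, an accent-insensitive comparison
# only needs to filter out the combining characters.
print(strip_marks(word) == "cafe")                          # True

# The two forms are canonically equivalent: renormalizing gives back NFC.
print(unicodedata.normalize("NFC", nfd) == nfc)             # True
```

The same decomposition is what lets a pattern written for the base letter also match the accented one, at the cost of the extra code point per accent shown in the first two lines of output.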
The W3C is currently updating their recommendations [1]:
This version of this document was published to
indicate the Internationalization Core Working
Group's intention to substantially alter or replace
the recommendations found here with very different
recommendations in the near future.
Would you recommend that the W3C change their recommendation from:
Use NFC when exchanging text over the Internet.
to:
Use NFD when exchanging text over the Internet.
Would that be your recommendation to the W3C?
/Roger
[1] http://www.w3.org/TR/charmod-norm/
Received on Sun Feb 03 2013 - 07:31:55 CST