On Feb 1, 2013, at 6:07 AM, "Costello, Roger L." <costello_at_mitre.org> wrote:
> So why would one ever generate text in decomposed form (NFD)?
>
The Unihan database is stored in NFD because it makes the regular expressions used to qualify its contents much, *much* simpler. I imagine that things like fuzzy text matching are easier in NFD. At worst, it's about as useful as UTF-32: occasionally very handy in internal processing, but not terribly attractive overall.
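
To make both points concrete, here is a rough Python sketch using only the standard library. The pattern and the helper names (`SYLLABLE`, `strip_marks`, `fuzzy_equal`) are made up for illustration and are not the expressions actually used for Unihan. The idea is that in NFD one small character class for base letters plus the combining-mark range covers every accented letter, and fuzzy matching reduces to dropping the marks.

```python
import re
import unicodedata

# Hypothetical pattern: ASCII base letters, each optionally followed by
# combining marks from the Combining Diacritical Marks block (U+0300..U+036F).
SYLLABLE = re.compile(r"(?:[a-z][\u0300-\u036f]*)+")

def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def fuzzy_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring diacritics and case."""
    return strip_marks(a).casefold() == strip_marks(b).casefold()

# "l" + U+01DA (u with diaeresis and caron), written in precomposed (NFC) form.
nfc = "l\u01da"
nfd = unicodedata.normalize("NFD", nfc)  # l, u, U+0308, U+030C

print(bool(SYLLABLE.fullmatch(nfd)))  # the simple pattern accepts the NFD form
print(bool(SYLLABLE.fullmatch(nfc)))  # the precomposed letter is outside [a-z]
print(fuzzy_equal("R\u00e9sum\u00e9", "resume"))
```

Handling the NFC form with a pattern in the same style would mean listing every precomposed letter-plus-tone combination explicitly, which is where the simplification comes from.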
Received on Fri Feb 01 2013 - 11:57:00 CST