From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Dec 17 2003 - 05:40:46 EST
On 16/12/2003 17:21, Kenneth Whistler wrote:
>Correcting myself:
>
>
>
>>Note that none of the 3 sets of equivalence classes violates
>>*canonical* equivalence, because none of the 8 sequences involved
>>is canonically equivalent to any other. In other words, no matter
>>which of the 3 approaches you take to case folding, in no instance
>>are you claiming that canonically equivalent sequences are to be
>>interpreted differently.
>>
>>
>
>Actually, dotted I *is* canonically equivalent to <I, dot above>
>(I overlooked that when compiling the summary.)
>
>
>
This implies (since there are no decomposition exclusions) that NFD,
applied to Turkic text, violates the very sensible rule DO NOT USE
COMBINING DOTS WITH I's. It also leads to all sorts of potential
confusion: for example, simple case folding, full case folding and
lowercasing applied to NFD Turkic text all generate the nonsensical
<i, dot above>. This could be a serious problem, although perhaps one
that is not worth fixing.
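
The effect can be reproduced with a minimal sketch in Python. It assumes
the standard unicodedata module and Python's built-in str.lower() and
str.casefold(), which apply the default (untailored, non-Turkic) full
mappings. The helper names are hypothetical and chosen only for
illustration, and simple case folding is not shown because Python has no
built-in for it.

```python
import unicodedata

DOTTED_CAPITAL_I = "\u0130"     # LATIN CAPITAL LETTER I WITH DOT ABOVE
COMBINING_DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE


def nfd(text: str) -> str:
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)


def nfd_then_lower(text: str) -> str:
    """Decompose to NFD, then apply default (non-Turkic) lowercasing."""
    return nfd(text).lower()


def nfd_then_fold(text: str) -> str:
    """Decompose to NFD, then apply default full case folding."""
    return nfd(text).casefold()


# NFD splits dotted capital I into <I, dot above>.
decomposed = nfd(DOTTED_CAPITAL_I)

# Lowercasing and full case folding map I to i and leave the combining
# dot in place, giving <i, dot above>.
lowered = nfd_then_lower(DOTTED_CAPITAL_I)
folded = nfd_then_fold(DOTTED_CAPITAL_I)

# NFC recomposes <I, dot above> into the single dotted capital I, which
# is the canonical equivalence under discussion.
recomposed = unicodedata.normalize("NFC", "I" + COMBINING_DOT_ABOVE)
```

A Turkic-aware lowercasing would presumably need a locale-tailored
library; the built-in string methods used here are locale-independent.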
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/