Re: Case mapping of dotless lowercase letters

From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Dec 17 2003 - 05:40:46 EST

    On 16/12/2003 17:21, Kenneth Whistler wrote:

    >Correcting myself:
    >
    >
    >
    >>Note that none of the 3 sets of equivalence classes violates
    >>*canonical* equivalence, because none of the 8 sequences involved
    >>is canonically equivalent to any other. In other words, no matter
    >>which of the 3 approaches you take to case folding, in no instance
    >>are you claiming that canonically equivalent sequences are to be
    >>interpreted differently.
    >>
    >>
    >
    >Actually, dotted I *is* canonically equivalent to <I, dot above>
    >(I overlooked that when compiling the summary.)
    >
    >
    >
    This implies (since there are no decomposition exclusions) that NFD,
    when applied to Turkic text, violates the very sensible rule DO NOT USE
    COMBINING DOTS WITH I's. It also leads to all sorts of potential
    confusion: for example, simple case folding, full case folding, and
    lowercasing, when applied to NFD Turkic text, all generate the
    nonsensical <i, dot above>. This could be a serious problem, although
    one that may not be worth fixing.
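
    A minimal sketch of the effect, assuming Python 3 and its standard
    unicodedata module, whose str.lower() and str.casefold() apply the
    default (non-Turkic) mappings. The names DOTTED_CAPITAL_I and describe
    are only illustrative:

```python
import unicodedata

# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE, as used in Turkic text.
DOTTED_CAPITAL_I = "\u0130"


def describe(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFD replaces U+0130 with its canonical decomposition <I, dot above>.
nfd = unicodedata.normalize("NFD", DOTTED_CAPITAL_I)

# Default lowercasing and case folding map I to i and leave the
# combining dot in place, giving <i, dot above>.
lowered = nfd.lower()
folded = nfd.casefold()

print(describe(nfd))
print(describe(lowered))
print(describe(folded))
```

    This only illustrates the default, locale-independent mappings; it
    says nothing about how a Turkic-tailored implementation would treat
    the same sequence.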

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

