RE: Case mapping of dotless lowercase letters

From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Wed Dec 17 2003 - 08:37:46 EST

    [resending; better set the encoding to UTF-8...]

    Peter Kirk wrote:
    ...
    > used on Turkic text, violates the very sensible rule DO NOT USE
    > COMBINING DOTS WITH I's, and leads to all sorts of potential
    > confusion, e.g. that both simple and full case folding and
    > lowercasing applied to NFD Turkic text generate the nonsensical
    > <i, dot above>. This could be a serious problem - although one
    > that may not be worth fixing.

    <i, dot above> is not nonsensical. It is used in Lithuanian for
    such things as <i, dot above, tilde above>, as well as for other
    additional accents above an i or a j that keeps its dot.
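
For illustration only, here is a minimal Python sketch of the two sequences under discussion. It assumes Python 3's locale-independent `str.lower()` and the standard `unicodedata` module; the `describe` helper and the variable names are made up for this example and come from nobody's implementation.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
TILDE = "\u0303"      # COMBINING TILDE


def describe(s):
    """Return the Unicode character names of the code points in s."""
    return [unicodedata.name(ch) for ch in s]


# U+0130 (capital I with dot above, as in Turkic text) in NFD: <I, dot above>.
turkic_capital_nfd = unicodedata.normalize("NFD", "\u0130")

# Default (non-Turkic) lowercasing keeps the combining dot: <i, dot above>.
lowered = turkic_capital_nfd.lower()

# The Lithuanian sequence mentioned above: <i, dot above, tilde above>.
lithuanian = "i" + DOT_ABOVE + TILDE

print(describe(lowered))
print(describe(lithuanian))
```

The same <i, dot above> prefix turns up in both cases, which is why the sequence cannot simply be treated as invalid.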

                    /kent k

    Lithuanian alphabet (not listing all the uppercase
    accented letters)

     Aa (Àà, Áá Ãã Ąą {Ą́}{ą́}), Bb, Cc (CHch), Čč, Dd,
     Ee (Ęę, Ėė è é ẽ ę {ę́} {ę̃} ė {ė́} {ė̃}), Ff, Gg, Hh,
     Ii (Ì{i̇̀} Í{i̇́} Ĩ{i̇̃} Įį {Į́}{į̇́} {Į̃}{į̇̃}, Yy, Ýý, Ỹỹ),
     Jj ({J̃}{j̇̃}), Kk, Ll ({l̃}), Mm ({m̃}), Nn (Ññ),
     Oo (ò, ó, õ), Pp, [Qq], Rr (r̃), Ss, Šš, Tt,
     Uu (ù ú ũ Ųų {ų́} {ų̃} Ūū {ū́}), Vv, [Ww], [Xx], Zz, Žž
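
As a rough check, the braced entries in the list appear to be the ones that have no single precomposed code point (that reading of the braces is an assumption, not something stated in the message). The Python sketch below normalizes a few of them to NFC and lists the resulting code points; `nfc_code_points` and `samples` are hypothetical names used only here.

```python
import unicodedata


def nfc_code_points(s):
    """Normalize s to NFC and return its code points as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", s)]


# A few entries from the list, spelled out as base letter + combining marks.
samples = {
    "a + grave": "a\u0300",
    "i + dot above + tilde": "i\u0307\u0303",
    "i + ogonek + dot above + acute": "i\u0328\u0307\u0301",
    "e + dot above + acute": "e\u0307\u0301",
}

for label, text in samples.items():
    # A single code point means a precomposed letter exists;
    # more than one means the combining sequence survives NFC.
    print(label, nfc_code_points(text))
```

Only the first sample collapses to one code point; the others stay as sequences, and the dotted i forms keep an explicit COMBINING DOT ABOVE.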


