Re: Tentative Definition of Casefolding

From: SADAHIRO Tomoyuki (bqw10602@nifty.com)
Date: Sat Jun 10 2006 - 22:03:35 CDT

    On Sun, 11 Jun 2006 00:09:15 +0100, "Richard Wordingham" <richard.wordingham@ntlworld.com> wrote

    > C: Draft form:
    > If uc(X) and uc(Y) are canonically equivalent, lc(X) and lc(Y) are
    > canonically equivalent, and tc(X) and tc(Y) are canonically equivalent, so
    > are f(X) and f(Y).

    > D: Draft form:
    > If X is the concatenation of X1 and X2, lc(X) is the concatenation of lc(X1)
    > and lc(X2), uc(X) is the concatenation of uc(X1) and uc(X2), and tc(X) is
    > the concatenation of tc(X1) and lc(X2) [N.B. lc, not tc!] then f(X) is the
    > concatenation of f(X1) and f(X2).

    Interesting. Do these drafts imply that if at least one of the
    uppercase, lowercase, and titlecase mappings is decomposed, then
    *all* of them must be decomposed?

    [CURRENT] code; lower; title; upper;
    00DF; 00DF; 0053 0073; 0053 0053; # latin small sharp S
    FB00; FB00; 0046 0066; 0046 0046; # latin small lig. FF
    0149; 0149; 02BC 004E; 02BC 004E; # latin small N prec. by apos.
    1FB3; 1FB3; 1FBC; 0391 0399; # greek small ALPHA w. YPOGEGRAMMENI
    1FBC; 1FB3; 1FBC; 0391 0399; # greek capital ALPHA w. PROSGEGRAMMENI

    [according to the PROPOSAL] code; lower; title; upper;
    00DF; 0073 0073; 0053 0073; 0053 0053; # latin small sharp S
    FB00; 0066 0066; 0046 0066; 0046 0046; # latin small lig. FF
    0149; 02BC 006E; 02BC 004E; 02BC 004E; # latin small N prec. by apos.
    1FB3; 03B1 03B9; 0391 03B9; 0391 0399; # greek small ALPHA w. YPOGEGRAMMENI
    1FBC; 03B1 03B9; 0391 03B9; 0391 0399; # greek capital ALPHA w. PROSGEGRAMMENI
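
    As a rough illustration of the difference between the two tables, the
    sketch below checks each row for mappings that are only partly
    decomposed. It assumes that "decomposed" can be read as "maps to more
    than one code point"; that reading, and the helper names parse and
    mixed_rows, are mine and not part of the drafts.

```python
# Rows copied from the two tables above: code; lower; title; upper.
CURRENT = """\
00DF; 00DF; 0053 0073; 0053 0053
FB00; FB00; 0046 0066; 0046 0046
0149; 0149; 02BC 004E; 02BC 004E
1FB3; 1FB3; 1FBC; 0391 0399
1FBC; 1FB3; 1FBC; 0391 0399
"""

PROPOSAL = """\
00DF; 0073 0073; 0053 0073; 0053 0053
FB00; 0066 0066; 0046 0066; 0046 0046
0149; 02BC 006E; 02BC 004E; 02BC 004E
1FB3; 03B1 03B9; 0391 03B9; 0391 0399
1FBC; 03B1 03B9; 0391 03B9; 0391 0399
"""


def parse(table):
    """Return {code: (lower, title, upper)}, each mapping a tuple of code points."""
    rows = {}
    for line in table.splitlines():
        fields = [f.strip() for f in line.split(";")]
        code, mappings = fields[0], fields[1:4]
        rows[code] = tuple(tuple(m.split()) for m in mappings)
    return rows


def mixed_rows(table):
    """List codes where some, but not all, mappings have more than one code point.

    Assumption: "decomposed" is approximated by "more than one code point".
    """
    mixed = []
    for code, mappings in parse(table).items():
        multi = [len(m) > 1 for m in mappings]
        if any(multi) and not all(multi):
            mixed.append(code)
    return mixed


print("current: ", mixed_rows(CURRENT))
print("proposal:", mixed_rows(PROPOSAL))
```

    Under that assumption, every row of the current table is reported as
    mixed, and no row of the proposed table is.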

    > K: Draft Form:
    > If uc(X) and uc(Y) are compatibility equivalent, lc(X) and lc(Y) are
    > compatibility equivalent, and tc(X) and tc(Y) are compatibility equivalent,
    > so are f(X) and f(Y).

    Do you mean "If X and Y are compatibility equivalent, then uc(X) and
    uc(Y) are compatibility equivalent, lc(X) and lc(Y) are compatibility
    equivalent, and tc(X) and tc(Y) are compatibility equivalent,
    and so are f(X) and f(Y)."?

    According to draft form K (original), the case mappings of
    SQUARE MV MEGA (U+33B9) will be the same as those of the Latin
    sequence <M, V>, but the case mappings of SQUARE MV (U+33B7) will be
    different.
    According to draft form K read as implying "If X and Y are
    compatibility equivalent", the case mappings of both SQUARE MV MEGA
    and SQUARE MV will be the same as those of the Latin sequence <M, V>.
    (Note: we have square mV and MV, but we don't have square mv and Mv.)

    But is case folding among SQUARE MV (millivolt), SQUARE MV MEGA
    (megavolt), and the sequence <m, v> in a bicameral script useful?
    In my opinion, case mappings that respect and/or preserve compatibility
    equivalence are not a good idea. Some compatibility-decomposable
    characters will lose much of their meaning through such a case folding.
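
    The loss of meaning can be seen with a small sketch in Python, using
    only the standard unicodedata module. The function fold_after_nfkd is
    a hypothetical stand-in for a compatibility-respecting folding (NFKD
    followed by the built-in casefold); it is not the f() of the drafts.

```python
import unicodedata

SQUARE_MV = "\u33b7"       # SQUARE MV (millivolt)
SQUARE_MV_MEGA = "\u33b9"  # SQUARE MV MEGA (megavolt)


def nfkd(s):
    """Compatibility decomposition of s."""
    return unicodedata.normalize("NFKD", s)


def fold_after_nfkd(s):
    """Hypothetical compatibility-respecting folding: NFKD, then casefold."""
    return nfkd(s).casefold()


for ch in (SQUARE_MV, SQUARE_MV_MEGA):
    # Print the character name, its compatibility decomposition,
    # and the result of the hypothetical folding.
    print(unicodedata.name(ch), repr(nfkd(ch)), repr(fold_after_nfkd(ch)))

# After the hypothetical folding, millivolt and megavolt are no longer distinct.
print(fold_after_nfkd(SQUARE_MV) == fold_after_nfkd(SQUARE_MV_MEGA))
```

    The two symbols decompose to <m, V> and <M, V> respectively, so a
    folding applied after compatibility decomposition maps both to the
    same string, while the plain casefold of each symbol keeps them apart.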

    Regards,
    SADAHIRO Tomoyuki


