Re: What are the issues in having U+FB06 fold to U+FB05?

From: Mark Davis ☕ (mark@macchiato.com)
Date: Wed Jun 08 2011 - 16:33:47 CDT

    As to the first pair (U+FB05 and U+FB06), it would seem reasonable. Simple
    case folding is not covered by the following stability policies:

    http://www.unicode.org/policies/stability_policy.html#Case_Folding
    http://www.unicode.org/policies/stability_policy.html#Case_Pair

    However, the committee may be leery of changing these even though they are
    not covered by those policies. You can file a request form for the committee
    to consider it, at http://unicode.org/reporting.html

    The other two pairs (the Greek ones) are special cases; they casefold
    together only because of the way that the full case mapping is computed.
    Their equivalence is normally captured by canonical-equivalent folding,
    that is, by normalization. Because simple folding works code point by code
    point, and can only result in a single code point, they can't be added.
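
    A minimal Python sketch of the distinction drawn above, assuming that
    Python's str.casefold() applies full case folding and that
    unicodedata.normalize("NFD", ...) applies canonical decomposition. The
    helper names full_fold_equal and nfd_equal are illustrative only and do
    not come from the thread:

```python
import unicodedata

# The three pairs listed in the quoted message below.
PAIRS = [
    ("\uFB05", "\uFB06"),  # LATIN SMALL LIGATURE LONG S T / ST
    ("\u0390", "\u1FD3"),  # GREEK IOTA WITH DIALYTIKA AND TONOS / OXIA
    ("\u03B0", "\u1FE3"),  # GREEK UPSILON WITH DIALYTIKA AND TONOS / OXIA
]


def full_fold_equal(a: str, b: str) -> bool:
    """Compare two strings after full case folding."""
    return a.casefold() == b.casefold()


def nfd_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD), no folding."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


for a, b in PAIRS:
    print(
        f"U+{ord(a):04X} / U+{ord(b):04X}: "
        f"full fold equal = {full_fold_equal(a, b)}, "
        f"NFD equal = {nfd_equal(a, b)}"
    )
```

    All three pairs compare equal under full folding. Only the two Greek
    pairs also compare equal under NFD alone; the ligature pair has only
    compatibility decompositions, so NFD leaves its members distinct.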

    Mark

    *— Il meglio è l’inimico del bene —*

    On Sun, Jun 5, 2011 at 08:17, Karl Williamson <public@khwilliamson.com> wrote:

    > There are three pairs of characters in Unicode 6.0 in which each member of
    > the pair has a full fold to the same sequence, yet there is no simple fold
    > relation between them. They are:
    >
    > U+FB05 LATIN SMALL LIGATURE LONG S T and
    > U+FB06 LATIN SMALL LIGATURE ST
    > both fold to 'st';
    >
    > U+0390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
    > U+1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
    > both fold to the sequence "U+03B9 U+0308 U+0301" or (the dot standing for
    > concatenation)
    > GREEK SMALL LETTER IOTA . COMBINING DIAERESIS . COMBINING ACUTE ACCENT
    >
    > U+03B0 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
    > U+1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
    > both fold to the sequence "U+03C5 U+0308 U+0301" or
    > GREEK SMALL LETTER UPSILON . COMBINING DIAERESIS . COMBINING ACUTE ACCENT
    >
    > Under full case folding rules, each member of one of these pairs is
    > caselessly equivalent to the other member, even without adding NFD rules.
    > Correct me if I'm wrong, but shouldn't they also be caselessly equivalent
    > under simple folding rules? If so, I'm wondering what issues there would be
    > in creating an S rule for these pairs in CaseFolding.txt, so that they would
    > be considered caselessly equivalent even for applications that don't do full
    > case folding?
    >


