Re: What are the issues in having U+FB06 fold to U+FB05?

From: Karl Williamson <public_at_khwilliamson.com>
Date: Sat, 11 Jun 2011 09:04:52 -0600

On 06/08/2011 03:33 PM, Mark Davis ☕ wrote:
> As to the first, it would seem reasonable. The simple folding is not
> covered by the following stability policies:
>
> http://www.unicode.org/policies/stability_policy.html#Case_Folding
> http://www.unicode.org/policies/stability_policy.html#Case_Pair
>
> However, the committee may be leery of changing these even though they
> are not covered by those policies. You can file a request form for the
> committee to consider it, at http://unicode.org/reporting.html
>
> The other two are special cases; they casefold together because of the
> way that the full case mapping is computed. Their equivalence is
> normally captured by a canonical-equivalent folding. Because the simple
> folding is only codepoint by codepoint, and only resulting in single
> code points, they can't be added.
>
I didn't understand the last sentence above. But would it be fair to say
that a plausible case could be made for giving U+FB06 a simple fold to
U+FB05, but that there really shouldn't be a simple fold for the other
two pairs?
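
One possible reading of the quoted paragraph, sketched in Python (this is
an assumption about what was meant, not something stated in the thread):
the two Greek pairs are canonically equivalent, so comparing normalized
strings already makes them match, whereas U+FB05 and U+FB06 have only
compatibility decompositions and stay distinct under NFD. The helper name
canonically_equivalent is made up for illustration; unicodedata.normalize
is the standard-library call.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFD normalization."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Greek pairs: the OXIA characters decompose canonically to the TONOS ones.
print(canonically_equivalent("\u0390", "\u1FD3"))  # True
print(canonically_equivalent("\u03B0", "\u1FE3"))  # True

# Ligature pair: only compatibility decompositions, so NFD keeps them apart.
print(canonically_equivalent("\uFB05", "\uFB06"))  # False
```

If that reading is right, only the ligature pair needs a case-folding rule
to be matched at all; the Greek pairs are already handled by any comparison
that normalizes first.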

> Mark
>
> /— Il meglio è l’inimico del bene —/
>
>
> On Sun, Jun 5, 2011 at 08:17, Karl Williamson
> <public_at_khwilliamson.com> wrote:
>
> There are three pairs of characters in Unicode 6.0 in which each
> member of the pair has a full fold to the same sequence, yet there
> is no simple fold relation between them. They are:
>
> U+FB05 LATIN SMALL LIGATURE LONG S T and
> U+FB06 LATIN SMALL LIGATURE ST
> both fold to 'st';
>
> U+0390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
> U+1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
> both fold to the sequence "U+03B9 U+0308 U+0301" or (the dot
> standing for concatenation)
> GREEK SMALL LETTER IOTA . COMBINING DIAERESIS . COMBINING ACUTE ACCENT
>
> U+03B0 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
> U+1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
> both fold to the sequence "U+03C5 U+0308 U+0301" or
> GREEK SMALL LETTER UPSILON . COMBINING DIAERESIS . COMBINING ACUTE
> ACCENT
>
> Under full case folding rules, each member of one of these pairs is
> caselessly equivalent to the other member, even without adding NFD
> rules. Correct me if I'm wrong, but shouldn't they also be
> caselessly equivalent under simple folding rules? If so, I'm
> wondering what issues there would be in creating an S rule for these
> pairs in CaseFolding.txt, so that they would be considered
> caselessly equivalent even for applications that don't do full case
> folding?
>
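
The full-fold claim in the original message can be checked with a short
Python sketch. It assumes that Python's str.casefold() applies the full
case folding from CaseFolding.txt for whatever Unicode version the
interpreter ships with; the names PAIRS and fold_codepoints are
illustrative only. Python's standard library has no simple-folding API, so
only the full fold is shown.

```python
# The three pairs from the original message.
PAIRS = [
    ("\uFB05", "\uFB06"),  # LATIN SMALL LIGATURE LONG S T / LIGATURE ST
    ("\u0390", "\u1FD3"),  # GREEK IOTA WITH DIALYTIKA AND TONOS / OXIA
    ("\u03B0", "\u1FE3"),  # GREEK UPSILON WITH DIALYTIKA AND TONOS / OXIA
]

def fold_codepoints(ch: str) -> str:
    """Return the full case fold of ch as a 'U+XXXX ...' string."""
    return " ".join(f"U+{ord(c):04X}" for c in ch.casefold())

for a, b in PAIRS:
    # Both members of each pair should fold to the same sequence.
    print(f"U+{ord(a):04X} -> {fold_codepoints(a)}")
    print(f"U+{ord(b):04X} -> {fold_codepoints(b)}")
    print("same full fold:", a.casefold() == b.casefold())
```

No normalization step is involved here; the equality comes from the fold
mappings alone, which matches the "even without adding NFD rules" remark.
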
Received on Sat Jun 11 2011 - 10:08:14 CDT
