Re: Character folding in text editors from Eli Zaretskii on 2016-02-21 (Unicode Mail List Archive)

From: Eli Zaretskii <eliz_at_gnu.org>
Date: Sun, 21 Feb 2016 18:28:56 +0200

> From: Mark Davis ☕️ <mark_at_macchiato.com>
> Date: Sun, 21 Feb 2016 11:47:28 +0100
> Cc: Unicode Public <unicode_at_unicode.org>
>
> If you don't use ICU, you can also use the CLDR data directly, but you'll
> have to parse it yourself. You'd start with the root locale, then add in
> the mappings from the children (e.g. de.xml). The parsing is not trivial, but
> since you are only looking for equivalences (not ordering), it is somewhat
> simpler.

What about using allkeys.txt from the UCA database? Is that
equivalent to the root locale in CLDR, as far as equivalence for
searching is concerned? If not, how do these two differ? (I've read
http://www.unicode.org/reports/tr35/tr35-collation.html#Root_Collation,
but it left me unsure whether what it describes affects search
matches when secondary weights are ignored.)
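
To make the question concrete, here is a minimal Python sketch of what
"equivalence with secondary weights ignored" could look like when computed
straight from allkeys.txt: each entry is reduced to its non-zero primary
weights, and entries with the same primary key are grouped together. The
function names are hypothetical, the weights in the sample lines are
illustrative rather than copied from a particular UCA version, and variable
elements (marked with `*`) are assumed to be treated as non-ignorable.
Characters that only get implicit weights are not covered, and the sketch
says nothing about whether the result matches the CLDR root collation.

```python
import re
from collections import defaultdict

# One collation element, e.g. [.1C47.0020.0008] or [*0209.0020.0002].
ELEMENT = re.compile(r"\[([.*])([0-9A-Fa-f.]+)\]")


def parse_allkeys_line(line):
    """Return (string, primary_key) for one entry, or None for other lines."""
    line = line.split("#", 1)[0].strip()      # drop the trailing comment
    if not line or line.startswith("@"):      # blank line or @version etc.
        return None
    chars, _, elements = line.partition(";")
    text = "".join(chr(int(cp, 16)) for cp in chars.split())
    primaries = []
    for _marker, weights in ELEMENT.findall(elements):
        primary = int(weights.split(".")[0], 16)
        if primary != 0:                      # zero primary: ignorable here
            primaries.append(primary)
    return text, tuple(primaries)


def primary_classes(lines):
    """Group entries whose non-zero primary weights are identical."""
    classes = defaultdict(list)
    for line in lines:
        entry = parse_allkeys_line(line)
        if entry is not None:
            text, key = entry
            classes[key].append(text)
    return classes


# Illustrative sample in the allkeys.txt line format (weights are made up).
SAMPLE = """\
@version 8.0.0
0061 ; [.1C47.0020.0002] # LATIN SMALL LETTER A
0041 ; [.1C47.0020.0008] # LATIN CAPITAL LETTER A
00E1 ; [.1C47.0020.0002][.0000.0024.0002] # LATIN SMALL LETTER A WITH ACUTE
0062 ; [.1C60.0020.0002] # LATIN SMALL LETTER B
"""

classes = primary_classes(SAMPLE.splitlines())
```

With this sample, "a", "A" and "á" end up in one class and "b" in another,
which is the kind of folding a search that ignores secondary and tertiary
weights would need.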

Also, what is the consensus here about using UCA's decomps.txt for
folding characters when ignoring secondary and tertiary weights?

Received on Sun Feb 21 2016 - 10:30:08 CST
