Re: Unicode denormalizer

From: Markus Scherer (markus.icu@gmail.com)
Date: Wed Oct 06 2010 - 14:14:12 CDT


    On Wed, Oct 6, 2010 at 8:49 AM, Mark Davis ☕ <mark@macchiato.com> wrote:

    > ICU has a canonical iterator, one that provides all the strings that
    > produce the same result under toNFC(...).

    The algorithm is here:
    http://www.unicode.org/notes/tn5/#Enumerating_Equivalent_Strings

    API: Search for "ICU CanonicalIterator" (without the quotes).

    As Mark said, this is limited to canonical equivalences (NFC/NFD), but if
    necessary we could extend it to arbitrary normalization forms.
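
    To make "all the strings that produce the same result under toNFC(...)"
    concrete, here is a rough brute-force sketch in Python using only the
    standard unicodedata module. It is not the ICU CanonicalIterator and not
    the algorithm from the technical note; the function names
    (build_reverse_map, canonical_equivalents) are made up for illustration,
    and the approach is only practical for very short inputs.

```python
import unicodedata


def nfd(s):
    """Canonical decomposition (NFD) of a string."""
    return unicodedata.normalize("NFD", s)


def build_reverse_map():
    """Map each multi-step or singleton NFD string back to the single
    code points that canonically decompose to it (hypothetical helper)."""
    reverse = {}
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:  # skip surrogate code points
            continue
        ch = chr(cp)
        decomposed = nfd(ch)
        if decomposed != ch:
            reverse.setdefault(decomposed, []).append(ch)
    return reverse


REVERSE = build_reverse_map()
MAX_KEY = max(len(key) for key in REVERSE)


def canonical_equivalents(s):
    """Return the set of strings canonically equivalent to s.

    Brute-force closure: start from NFD(s), then repeatedly swap adjacent
    code points or replace a substring with a single character that
    decomposes to it, keeping only variants whose NFD is unchanged.
    """
    target = nfd(s)
    seen = {target}
    todo = [target]
    while todo:
        cur = todo.pop()
        variants = []
        # Reorderings: swap each adjacent pair of code points.
        for i in range(len(cur) - 1):
            variants.append(cur[:i] + cur[i + 1] + cur[i] + cur[i + 2:])
        # Compositions: replace a substring by a character decomposing to it.
        for i in range(len(cur)):
            for j in range(i + 1, min(len(cur), i + MAX_KEY) + 1):
                for ch in REVERSE.get(cur[i:j], ()):
                    variants.append(cur[:i] + ch + cur[j:])
        for v in variants:
            if v not in seen and nfd(v) == target:
                seen.add(v)
                todo.append(v)
    return seen
```

    For example, canonical_equivalents("\u00C5\u0323") (A with ring above,
    followed by a combining dot below) yields five strings, including the
    reordered fully decomposed forms and the one that starts with U+212B
    ANGSTROM SIGN. Every returned string normalizes to the same NFC form.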

    If you need additional support from ICU then please move this discussion
    there: http://site.icu-project.org/

    markus


