Re: Collation contractions and reordering, was: Hebrew composition model, with cantillation marks

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Mon Nov 03 2003 - 18:26:57 EST

    I suggest you try it out -
    http://oss.software.ibm.com/cgi-bin/icu/lx/en_US/utf-8/?_=he&EXPLORE_CollationElements

    ICU implements the UCA, including discontiguous contractions.

    markus

    Peter Kirk wrote:
    > On 03/11/2003 07:01, Kent Karlsson wrote:
    >> However, the UCA does ignore differences between order of
    >> *"non-blocking"* (**different** non-zero combining classes)
    >> combining marks **when processing contractions**.

    > But your mention of ignoring non-blocking combining marks when
    > processing contractions made me look at the newly released
    > http://www.unicode.org/reports/tr10/. I noticed there for the first
    > time, maybe because they are there for the first time, the rules S2.1.1
    > and S2.1.2 in section 4.2, and the explanatory note. If I understand
    > this correctly, it means that if a contraction is defined for shin and
    > sin dot (and no other relevant contractions), this will operate
    > successfully even if an arbitrary combination of vowels, dagesh, rafe
    > and meteg is sorted by normalisation between the shin and the sin dot.
    >
    > Is this correct? If so, I withdraw my complaint that the canonical order
    > for Hebrew makes collation impossible.
    >
    > Is this efficient? Another issue...
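
The behaviour asked about above can be sketched in a few lines of Python. This is only an illustration of the discontiguous matching steps in UCA section 4.2 (S2.1.1 onward) as I read them, not ICU's implementation. The names `CONTRACTIONS` and `extend_match` are hypothetical, and the one-entry contraction table is an assumption made for the example.

```python
import unicodedata

SHIN = "\u05E9"     # HEBREW LETTER SHIN, a starter (ccc 0)
SIN_DOT = "\u05C2"  # HEBREW POINT SIN DOT (ccc 25)
DAGESH = "\u05BC"   # HEBREW POINT DAGESH OR MAPIQ (ccc 21)
PATAH = "\u05B7"    # HEBREW POINT PATAH (ccc 17)

# Hypothetical tailoring with a single contraction: shin + sin dot.
CONTRACTIONS = {SHIN + SIN_DOT}


def extend_match(s, following, contractions):
    """Extend the match s over the non-starters that follow it.

    A non-starter C is treated as blocked if a skipped character between
    s and C has a combining class >= that of C. Returns the extended
    match and the characters left over, in their original order.
    """
    matched = s
    skipped = []
    for i, c in enumerate(following):
        ccc = unicodedata.combining(c)
        if ccc == 0:
            # A starter ends the scan; everything from here on is left alone.
            skipped.extend(following[i:])
            break
        blocked = any(unicodedata.combining(b) >= ccc for b in skipped)
        if not blocked and matched + c in contractions:
            matched += c          # absorb C into the match and remove it
        else:
            skipped.append(c)     # leave C in place for later processing
    return matched, "".join(skipped)


# Canonical ordering moves the sin dot (ccc 25) after patah (17) and dagesh (21).
nfd = unicodedata.normalize("NFD", SHIN + SIN_DOT + DAGESH + PATAH)
matched, rest = extend_match(nfd[0], nfd[1:], CONTRACTIONS)
print([f"U+{ord(ch):04X}" for ch in nfd])
print([f"U+{ord(ch):04X}" for ch in matched], [f"U+{ord(ch):04X}" for ch in rest])
```

In this sketch the contraction still matches although patah and dagesh sit between the shin and the sin dot after normalisation, because both have lower combining classes than the sin dot and so do not block it. A starter in between (another base letter, say) stops the scan and the contraction is not formed.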


