Re: NFC

From: Jon Hanna (jon@hackcraft.net)
Date: Wed Feb 01 2006 - 09:59:03 CST

    Tim Greenwood wrote:
    > Annex 8 of UAX #15 (Normalization Forms) describes the quick lookup
    > property of Yes/No/Maybe for determining if a string is NFC. When I
    > get a 'Maybe' is it sufficient to do the fuller analysis from the
    > previous 'Yes' character? In other words (I think) is the previous
    > 'yes' character a stable NFC code point? From the annex it seems to be
    > not, but I cannot think of an example.

    The stable NFC code-points are those which are both "Yes" for the quick
    checks, and have a combining class of 0.

    Remember that the quick check tests both the derived normalisation
    property (yes/no/maybe) and that the combining marks are in
    canonical order. If you have a "maybe" with a combining class of 0, you
    will have to search forwards up to, but not including, the next character
    with a combining class of 0. If you have a "maybe" combiner following a
    "yes" character with a combining class of 0, you need to check whether
    those two characters have a canonical composition.
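
    As a rough illustration of the fallback described above, here is a
    minimal Python sketch. It assumes the index of the "maybe" character
    has already been found by a separate quick-check pass. The function
    names recheck_span and span_is_nfc are hypothetical, and normalising
    the whole span is a simpler stand-in for testing one pair of
    characters for a canonical composition.

```python
import unicodedata


def recheck_span(text, i):
    """Return (start, end) of the slice to re-examine around index i.

    i is assumed to be the index of a character the quick check
    reported as "maybe".
    """
    # Walk back to the nearest character before i with combining class 0.
    start = max(i - 1, 0)
    while start > 0 and unicodedata.combining(text[start]) != 0:
        start -= 1
    # Walk forward up to, but not including, the next character
    # with combining class 0.
    end = i + 1
    while end < len(text) and unicodedata.combining(text[end]) != 0:
        end += 1
    return start, end


def span_is_nfc(text, i):
    """Do the fuller analysis on the span around i only."""
    start, end = recheck_span(text, i)
    segment = text[start:end]
    return unicodedata.normalize("NFC", segment) == segment
```

    For example, span_is_nfc("ae\u0301z", 2) looks only at "e" plus the
    combining acute and reports that the pair composes, without touching
    the rest of the string.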

    In practice the gain from further optimising is going to be slight.


