Re: Normalisation stability, was: Compression through normalization

From: Mark Davis (mark.davis@jtcsv.com)
Date: Tue Nov 25 2003 - 14:42:42 EST

    On my home page I have a link to a brief paper on minimal size for an NFC
    normalizer.

    http://www.macchiato.com/, see Normalization Footprint

    It was written for Unicode 3.0, but the sizes shouldn't have changed much since
    then. Supporting supplementary characters would add a bit of extra code.

    Mark
    __________________________________
    http://www.macchiato.com
    ► शिष्यादिच्छेत्पराजयम् ◄

    ----- Original Message -----
    From: "Doug Ewell" <dewell@adelphia.net>
    To: "Unicode Mailing List" <unicode@unicode.org>
    Cc: <verdy_p@wanadoo.fr>; "John Cowan" <cowan@mercury.ccil.org>
    Sent: Tue, 2003 Nov 25 11:18
    Subject: Re: Normalisation stability, was: Compression through normalization

    > Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
    >
    > > I'm not convinced that there's a significant improvement when only
    > > checking for normalization but not performing it. It requires at least
    > > a list of the characters that are acceptable in a normalization form,
    > > as well as their combining classes.
    >
    > UAX #15 begs to differ. See Annex 8, "Detecting Normalization Forms":
    >
    > http://www.unicode.org/reports/tr15/#Annex8
    >
    > In particular, the list of characters and derived properties, while
    > large, is *much* smaller than the complete UCD.
    >
    > I have not tested this, and don't currently plan to.
    >
    > -Doug Ewell
    > Fullerton, California
    > http://users.adelphia.net/~dewell/
    >
    >
    >
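
As a rough illustration of checking for NFC without always normalizing, here is
a minimal Python sketch. It is not the UAX #15 algorithm itself: the standard
unicodedata module does not expose the NFC_Quick_Check property, so the sketch
substitutes two cheap tests (a range test and a combining-class order test) and
falls back to full normalization when they are inconclusive. The function name
is_nfc is hypothetical.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in NFC (illustrative sketch only)."""
    # Fast path: code points below U+0300 all have combining class 0 and
    # are left unchanged by NFC, so such text is already normalized.
    if all(ord(ch) < 0x300 for ch in text):
        return True

    # Cheap rejection: a combining mark with a lower nonzero combining
    # class than the mark before it violates canonical ordering.
    last_ccc = 0
    for ch in text:
        ccc = unicodedata.combining(ch)
        if ccc != 0 and ccc < last_ccc:
            return False
        last_ccc = ccc

    # Inconclusive: fall back to normalizing and comparing. A real
    # quick check would consult NFC_Quick_Check here instead.
    return unicodedata.normalize("NFC", text) == text
```

For example, is_nfc("e\u0301") goes through the fallback and returns False,
while is_nfc("a\u0301\u0323") is rejected by the ordering test alone, since
U+0323 (class 220) follows U+0301 (class 230).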


