Tailored normalization (was RE: Public Review Issues update)

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Tue Feb 04 2003 - 10:11:18 EST

    Rick McGowan wrote:
    > Please note that the Issues for Public Review have been
    > updated with a new review item regarding tailoring of
    > normalization. Please see issue number 7 on this page:

    "The UTC is considering allowing limited tailoring of normalization forms."

    My €0.02 worth of comment:

    Issue 7 should be rejected because it is useless: it tries to allow what
    is already allowed and could not possibly be forbidden.

    It has always been possible to invent alternative "normalization" schemes,
    similar in principle but not identical to any of the four standard Unicode
    Normalization Forms. This is part of the processing that an application is
    allowed to do to text, and that a user may expect a certain application to
    perform.

    E.g., the purpose of a certain program can be to convert traditional CJK
    ideographs to simplified ones, or to transliterate one script into another,
    or to change all uppercase letters to lowercase.

    Some of these character-level operations can work in a very similar way to
    the standard normalization forms (and maybe even reuse the same library
    functions) but, IMHO, there is no need for the Unicode Standard to
    explicitly authorize, endorse or even just acknowledge the existence of
    these private normalization schemes.

    IMHO, if you need such a non-standard normalization scheme, just implement
    it. But invent your own name for it: don't call it "tailored Unicode NFxx".
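
    To make the point concrete, here is a minimal sketch in Python of such a
    private scheme. It reuses the standard library's NFD and NFC functions but
    adds its own steps on top (dropping combining marks, lowercasing, and a
    small private mapping). The name house_fold and the PRIVATE_FOLDS table
    are hypothetical, made up for illustration; nothing here is a Unicode
    Normalization Form, and it is not presented as one.

```python
import unicodedata

# Hypothetical private mapping; NOT part of any Unicode Normalization Form.
PRIVATE_FOLDS = {"ß": "ss", "ø": "o"}


def house_fold(text: str) -> str:
    """Apply a private, non-standard folding scheme (hypothetical name)."""
    # Reuse the standard machinery: canonical decomposition (NFD).
    decomposed = unicodedata.normalize("NFD", text)
    # Non-standard step: drop combining marks (non-zero combining class).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Non-standard step: lowercase, then apply the private mapping.
    lowered = stripped.lower()
    folded = "".join(PRIVATE_FOLDS.get(ch, ch) for ch in lowered)
    # Reuse the standard machinery again: canonical composition (NFC).
    return unicodedata.normalize("NFC", folded)


if __name__ == "__main__":
    for sample in ("Ångström", "Straße", "SØREN"):
        print(sample, "->", house_fold(sample))
```

    The result is deliberately different from what NFC or any other standard
    form would produce, which is exactly why it carries its own name.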

    _ Marco


