Re: Does Unicode 4.1 change NFC?

From: Peter Kirk (peterkirk@qaya.org)
Date: Sun Apr 03 2005 - 16:36:15 CST

    On 03/04/2005 22:28, Doug Ewell wrote:

    >Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
    >
    >>>Yes. New CJK compatibility ideographs U+FA70..U+FAD9 have canonical
    >>>decompositions into single characters. For example NFC(U+FACF) =
    >>>U+2284A (for the first time a BMP character is normalized to
    >>>something outside BMP).
    >>>
    >>>
    >>Isn't that against Unicode stability? Shouldn't it have been the
    >>reverse, keeping U+FACF stable and normalizing U+2284A to U+FACF to
    >>keep the compatibility? If this was added because of a past error,
    >>then this MUST be urgently documented.
    >>
    >>
    >
    >They're new characters, Philippe. They weren't encoded until 4.1.
    >
    In that case these character allocations seem perverse. Both of these
    characters could have been assigned to the BMP, or both outside it, or
    the normalisation could have gone the other way, as Philippe suggested.
    Introducing a BMP character which normalises to a character outside the
    BMP risks breaking existing implementations, especially those which
    fully support only the BMP. The BMP is no longer a closed subset of
    Unicode under operations like normalisation, although existing
    implementations expected it to be closed. Maybe someone thought it was
    a good idea to force implementations to be upgraded, but it strikes me
    as a recipe for disaster. It could also be a serious security hole, as
    hackers try sending U+FACF to various implementations in an attempt to
    crash them.
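
To make the closure point concrete, here is a minimal sketch in Python. It assumes an interpreter whose bundled unicodedata tables are Unicode 4.1 or later, and the helper name nfc_leaves_bmp is hypothetical, chosen only for this illustration. It checks whether a BMP character normalises under NFC to something containing a supplementary-plane code point, which is the case described above for U+FACF.

```python
import unicodedata

def nfc_leaves_bmp(ch: str) -> bool:
    """Return True if ch is a BMP character whose NFC form
    contains at least one code point above U+FFFF."""
    if ord(ch) > 0xFFFF:
        return False  # already outside the BMP, not the case of interest
    normalized = unicodedata.normalize("NFC", ch)
    return any(ord(c) > 0xFFFF for c in normalized)

src = "\uFACF"
dst = unicodedata.normalize("NFC", src)

# Show the mapping as code points.
print(f"U+{ord(src):04X} -> " + " ".join(f"U+{ord(c):04X}" for c in dst))

# Count UTF-16 code units before and after normalisation; code that
# budgets one 16-bit unit per BMP character would see the text grow.
units_before = len(src.encode("utf-16-le")) // 2
units_after = len(dst.encode("utf-16-le")) // 2
print(units_before, units_after)
```

An implementation that stores text as fixed 16-bit units without surrogate support would have no valid representation for the normalised result, so a check along these lines could be run over its input before normalising.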

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/


    This archive was generated by hypermail 2.1.5 : Sun Apr 03 2005 - 16:36:46 CST