re: Request clarification on disunification based on different character properties

From: verdy_p (verdy_p@wanadoo.fr)
Date: Mon Sep 07 2009 - 07:50:31 CDT


    > From: "Shriramana Sharma"
    > To: "unicode@unicode.org"
    > Cc:
    > Subject: Request clarification on disunification based on different character properties
    >
    >
    > Hello. Again the disunification question. P 29 of the P&P document:
    >
    > If a character disunification cannot be achieved by adding one
    > new character without requiring a change in very significant properties
    > of the existing character and without changing the representative glyph
    > or range of expected glyphs for the existing character, then new
    > characters will be added for each of the distinct, specific letterforms
    > required.
    >
    > This positively confuses me. The text IMHO could have been clearer.

    He is not alone in finding the formulation confusing, for two reasons:
    (1) the condition opens with a self-contradiction: "a character disunification cannot be achieved by adding a new
    character" contradicts itself, because adding a new character (for the purpose indicated later on) means that a
    disunification has occurred, so it effectively CAN be achieved by this addition, yet the sentence says the opposite
    by using CANNOT.
    (2) it uses a double negation later on, "cannot" and "without", also within the condition (this should be avoided
    in all recommendations, notably within the conditions for another assertion)

    I can perfectly understand the principles implicitly conveyed by this sentence, but the formulation has to be made
    clearer without dropping any of the conditions expressed there. Both problems above can be solved. In fact, it is
    the placement of the word "without" that is confusing. It would simply be clearer if "without" were moved to the
    place where "by" is used, just before "adding" (and consequently an "and" changed into an "or", for logical
    reasons):

    "If a character unification cannot be maintained without changing very significant properties of the existing
    character and without changing the representative glyph or range of expected glyphs for the existing character, then
    new characters will be added for each of the distinct, specific letterforms required."

    The other source of confusion is the lack of a definition of "very significant properties". Here I think that these
    properties should be defined as, at least, all those that are subject to the encoding stability rules, plus the
    mandatory properties in the UCD or in the ISO 10646 charmaps (this includes the selected representative glyph, even
    though it is not directly subject to the stability rule and some limited variation may occur to fix some
    interpretation problems; conversely, the mandatory assigned character name is stable but is not descriptive enough
    to require a new addition/disunification, even if this assigned name is confusing or wrong):
    - so, within the ISO 10646 context, all that remains is the distinction offered by encoding within a distinct
    block assigned to a specific script
    - in the context of Unicode, the general category is also important, as well as other stable properties like the
    assigned equivalents or compatibility equivalents, the normalized forms, and the basic case mappings.
    - collation problems are not sufficient to require disunification unless there are contrasting pairs within the
    same language that would require an additional distinction, and no other encoding conventions can be used (such as
    the use of (dis)joiners in Indic scripts)
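
    To make the Unicode-side properties listed above more concrete, here is a rough sketch of how one might inspect
    them for a pair of characters. It uses Python's standard unicodedata module, whose data reflects whichever Unicode
    version ships with the interpreter. The helper names stable_properties and differing_properties are hypothetical,
    and the set of properties shown is only an illustrative subset, not a definition of "very significant properties".

```python
import unicodedata


def stable_properties(ch):
    """Collect a few per-character properties (illustrative subset only)."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "general_category": unicodedata.category(ch),
        # Raw decomposition mapping from the UCD; a leading <tag> marks a
        # compatibility mapping, no tag marks a canonical one.
        "decomposition": unicodedata.decomposition(ch),
        "nfd": unicodedata.normalize("NFD", ch),
        "nfkd": unicodedata.normalize("NFKD", ch),
        # Case mappings as implemented by Python's str methods.
        "lower": ch.lower(),
        "upper": ch.upper(),
    }


def differing_properties(a, b):
    """Names of the inspected properties (other than the name) that differ."""
    pa, pb = stable_properties(a), stable_properties(b)
    return sorted(key for key in pa if key != "name" and pa[key] != pb[key])


if __name__ == "__main__":
    # Example pair: LATIN CAPITAL LETTER K and KELVIN SIGN (U+212A).
    for ch in ("K", "\u212a"):
        print(stable_properties(ch))
    print(differing_properties("K", "\u212a"))
```

    Such a comparison only shows where two already encoded characters differ; whether a difference is significant
    enough to justify a disunification remains a judgment that the code cannot make.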


