From: verdy_p (verdy_p@wanadoo.fr)
Date: Mon Sep 07 2009 - 07:50:31 CDT
> From: "Shriramana Sharma"
> To: "unicode@unicode.org"
> Cc:
> Subject: Request clarification on disunification based on different character properties
>
>
> Hello. Again the disunification question. P 29 of the P&P document:
>
> If a character disunification cannot be achieved by adding one
> new character without requiring a change in very significant properties
> of the existing character and without changing the representative glyph
> or range of expected glyphs for the existing character, then new
> characters will be added for each of the distinct, specific letterforms
> required.
>
> This positively confuses me. The text IMHO could have been clearer.
He is not alone in finding the formulation confusing, for two reasons:
(1) the condition opens with a self-contradiction: "a character disunification cannot be achieved by adding a new
character" contradicts itself, because adding a new character (for the purpose indicated later on) means that a
disunification has occurred, so it effectively CAN be achieved by this addition, while the sentence says the
opposite using CANNOT.
(2) it then uses a double negation, "cannot" and "without", also within the condition (something that should be
avoided in all recommendations, notably within the conditions for another assertion)
I can perfectly understand the principles implicitly conveyed by this sentence, but the formulation has to be
clearer without omitting any of the conditions expressed there. The two problems above can be solved. In fact it
is the placement of the word "without" that is confusing. It would simply be clearer if the "without" were moved
to the place where "by" is used, just before "adding" (and consequently changing an "and" into an "or", for logical
reasons):
"If a character unification cannot be maintained without changing very significant properties of the existing
character and without changing the representative glyph or range of expected glyphs for the existing character, then
new characters will be added for each of the distinct, specific letterforms required."
The other source of confusion is the lack of a definition of "very significant properties". Here I think that these
properties should be defined at least as all those that are subject to the encoding stability rules, plus the
mandatory properties in the UCD or in the ISO 10646 charmaps (this includes the selected representative glyph, even
though it is not directly subject to the stability rules and some limited variation may occur to fix some
interpretation problems; conversely, the mandatory assigned character name is stable but is not descriptive enough
to require a new addition/disunification, even if this assigned name is confusing or wrong):
- so, within the ISO 10646 context, all that remains is the distinction offered by encoding within a distinct
block assigned to a specific script
- in the context of Unicode, the general category is also important, as well as other stable properties like the
assigned equivalents or compatibility equivalents, the normalized forms, and the basic case mappings.
- collation problems are not sufficient to require disunification unless there are contrasting pairs within the
same language that would require an additional distinction, and no other encoding conventions can be used (such as
the use of (dis)joiners in Indic scripts)
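
To make the Unicode-side properties listed above concrete, here is a minimal sketch using Python's standard
unicodedata module. The helper names stable_properties and differing_properties are hypothetical, chosen only for
this illustration, and the selection of properties is my reading of the list above, not an official definition of
"very significant properties". The values reported depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def stable_properties(ch):
    """Collect a few of the properties named in the list above for one character.

    Hypothetical helper: the choice of properties is an assumption made
    for illustration, not something defined by the P&P document.
    """
    return {
        "name": unicodedata.name(ch, ""),
        "general_category": unicodedata.category(ch),
        "decomposition": unicodedata.decomposition(ch),
        "nfd": unicodedata.normalize("NFD", ch),
        "nfkd": unicodedata.normalize("NFKD", ch),
        "upper": ch.upper(),
        "lower": ch.lower(),
    }


def differing_properties(a, b):
    """Return the names of the collected properties on which a and b differ."""
    pa, pb = stable_properties(a), stable_properties(b)
    return sorted(key for key in pa if pa[key] != pb[key])


if __name__ == "__main__":
    # U+0027 APOSTROPHE is punctuation, U+02BC MODIFIER LETTER APOSTROPHE is a letter:
    # two similar-looking characters kept apart by their general category.
    for ch in ("\u0027", "\u02bc"):
        print(f"U+{ord(ch):04X}", stable_properties(ch))
    print(differing_properties("\u0027", "\u02bc"))
```

The pair U+0027 / U+02BC is only one convenient illustration: the glyphs are close, but the general category
differs (punctuation versus modifier letter), which is the kind of property difference discussed above.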