From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Mar 12 2008 - 13:03:30 CST
Karl Pentzlin suggested:
> In the code table, the character has an informative note
> "The name of this character is misleading, it does not actually join
> graphemes", without giving more information.
The additional information is in the text of the standard
itself. There is a deliberate editorial policy against expanding
notes in the character names list into the paragraphs
that would be needed to explain oddballs such as this one.
>
> Is it appropriate to propose a formal alias like
> "COMBINING GRAPHEME SEPARATOR"?
I won't repeat what Asmus has already said.
But it might be pertinent to point out that the term
"SEPARATOR" in the standard is associated with
visible punctuation marks:
060D;ARABIC DATE SEPARATOR;Po;0;AL;;;;;N;;;;;
066B;ARABIC DECIMAL SEPARATOR;Po;0;AN;;;;;N;;;;;
066C;ARABIC THOUSANDS SEPARATOR;Po;0;AN;;;;;N;;;;;
10FB;GEORGIAN PARAGRAPH SEPARATOR;Po;0;L;;;;;N;;;;;
1368;ETHIOPIC PARAGRAPH SEPARATOR;Po;0;L;;;;;N;;;;;
10100;AEGEAN WORD SEPARATOR LINE;Po;0;L;;;;;N;;;;;
10101;AEGEAN WORD SEPARATOR DOT;Po;0;ON;;;;;N;;;;;
1091F;PHOENICIAN WORD SEPARATOR;Po;0;ON;;;;;N;;;;;
with white space:
180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;;;;;N;;;;;
2028;LINE SEPARATOR;Zl;0;WS;;;;;N;;;;;
2029;PARAGRAPH SEPARATOR;Zp;0;B;;;;;N;;;;;
with controls and format characters used in delineation
syntax:
001C;<control>;Cc;0;B;;;;;N;INFORMATION SEPARATOR FOUR;;;;
001D;<control>;Cc;0;B;;;;;N;INFORMATION SEPARATOR THREE;;;;
001E;<control>;Cc;0;B;;;;;N;INFORMATION SEPARATOR TWO;;;;
001F;<control>;Cc;0;S;;;;;N;INFORMATION SEPARATOR ONE;;;;
2063;INVISIBLE SEPARATOR;Cf;0;BN;;;;;N;;;;;
FFFA;INTERLINEAR ANNOTATION SEPARATOR;Cf;0;ON;;;;;N;;;;;
and with visible symbols for such things:
2396;DECIMAL SEPARATOR KEY SYMBOL;So;0;ON;;;;;N;;;;;
241C;SYMBOL FOR FILE SEPARATOR;So;0;ON;;;;;N;GRAPHIC FOR FILE SEPARATOR;;;;
241D;SYMBOL FOR GROUP SEPARATOR;So;0;ON;;;;;N;GRAPHIC FOR GROUP SEPARATOR;;;;
241E;SYMBOL FOR RECORD SEPARATOR;So;0;ON;;;;;N;GRAPHIC FOR RECORD SEPARATOR;;;;
241F;SYMBOL FOR UNIT SEPARATOR;So;0;ON;;;;;N;GRAPHIC FOR UNIT SEPARATOR;;;;
3037;IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL;So;0;ON;;;;;N;;;;;
There is not a combining mark among them, nor is "SEPARATOR"
likely to be used in combining mark names in the future,
since one of the most salient aspects of combining marks is
that they are kept ("glued") to their base in most
processing contexts.
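For anyone who wants to check this mechanically, here is a minimal sketch in Python. It assumes the semicolon-separated UnicodeData.txt layout of the records quoted above, with the General_Category in the third field and the old Unicode 1.0 name in the eleventh. The helper names parse_record and is_combining_mark are made up for illustration, and only a handful of the quoted records are embedded to keep it short.

```python
import unicodedata

# A few of the UnicodeData.txt records quoted above, copied verbatim.
RECORDS = """\
060D;ARABIC DATE SEPARATOR;Po;0;AL;;;;;N;;;;;
2028;LINE SEPARATOR;Zl;0;WS;;;;;N;;;;;
001F;<control>;Cc;0;S;;;;;N;INFORMATION SEPARATOR ONE;;;;
2063;INVISIBLE SEPARATOR;Cf;0;BN;;;;;N;;;;;
241F;SYMBOL FOR UNIT SEPARATOR;So;0;ON;;;;;N;GRAPHIC FOR UNIT SEPARATOR;;;;
"""


def parse_record(line):
    """Split one record into the fields used here (hypothetical helper)."""
    fields = line.split(";")
    return {
        "code": int(fields[0], 16),   # code point, hexadecimal
        "name": fields[1],            # character name
        "gc": fields[2],              # General_Category, e.g. Po, Zs, Cf
        "old_name": fields[10],       # Unicode 1.0 name, often empty
    }


def is_combining_mark(gc):
    """Combining marks have General_Category Mn, Mc, or Me."""
    return gc.startswith("M")


records = [parse_record(line) for line in RECORDS.splitlines() if line]

# Any "SEPARATOR" records that are combining marks? For the sample above, none.
separator_marks = [r for r in records if is_combining_mark(r["gc"])]

# For contrast, look up U+034F in Python's bundled Unicode database.
cgj_category = unicodedata.category("\u034f")

for r in records:
    print(f"U+{r['code']:04X} {r['gc']} {r['name'] or r['old_name']}")
print("combining marks among them:", separator_marks)
print("U+034F category:", cgj_category)
```

Note that the unicodedata module reflects whatever Unicode version ships with the Python build, so a scan of the live database may not match the records quoted in this message.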
So no, I don't think "COMBINING GRAPHEME SEPARATOR" would
be an appropriate alias for U+034F COMBINING GRAPHEME JOINER,
much less an appropriate *formal* alias -- which, as Asmus
pointed out, is effectively a claim that the formal alias
is a normative correction of an existing, defective (but
immutable) name for a character.
My recommendation is to just get used to calling U+034F the
"CGJ" and stop worrying about what the initialism letters
stand for -- just like we don't actually spend too much
time worrying about what the letters in "ISO" stand for.
--Ken