From: verdy_p (verdy_p@wanadoo.fr)
Date: Mon Nov 23 2009 - 23:04:04 CST
The definition is correct, and explained in the table which says "A single base character is **not** a combining
character sequence."
The table distinguishes four cases, defined without overlaps, which (when joined **together** in a union) make up
the definition of a single grapheme cluster.
Your conclusion is wrong, because a single letter 'A' is defined as a "legacy grapheme cluster", and a "legacy
grapheme cluster" ***is*** a grapheme cluster:
( CRLF
| ( Hangul-syllable | !Control )
Grapheme_Extend*
| . )
'A' fits this rule because it matches "!Control" (followed by zero Grapheme_Extend characters). The same row in the
table says that "A single base character is a grapheme cluster".
This is also stated earlier, in section 3, just below Table 1a:
"A legacy grapheme cluster is defined as a base (such as A or カ) followed by zero or more continuing characters."
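To make the difference concrete, here is a minimal Python sketch of the two rules. It is only an illustration under simplifying assumptions, not the algorithm of the report: "Control" is approximated by General_Category Cc, "Grapheme_Extend" by the categories Mn and Me, "base" by any character that is not a mark, ZWJ or ZWNJ, and the Hangul-syllable alternative is left out. The function names are hypothetical.

```python
import unicodedata

ZWJ, ZWNJ = "\u200d", "\u200c"

def is_mark(ch):
    # General_Category M* (Mn, Mc, Me).
    return unicodedata.category(ch).startswith("M")

def is_control(ch):
    # Assumption: only General_Category Cc is treated as Control here.
    return unicodedata.category(ch) == "Cc"

def is_extend(ch):
    # Assumption: rough stand-in for the Grapheme_Extend property.
    return unicodedata.category(ch) in ("Mn", "Me")

def is_combining_character_sequence(s):
    # base? ( Mark | ZWJ | ZWNJ )+
    if not s:
        return False
    rest = s
    if not (is_mark(s[0]) or s[0] in (ZWJ, ZWNJ)):
        rest = s[1:]  # consume the optional base
    # At least one Mark/ZWJ/ZWNJ is required after the optional base.
    return len(rest) > 0 and all(is_mark(c) or c in (ZWJ, ZWNJ) for c in rest)

def is_legacy_grapheme_cluster(s):
    # ( CRLF | ( Hangul-syllable | !Control ) Grapheme_Extend* | . )
    # The Hangul-syllable alternative is omitted in this sketch.
    if not s:
        return False
    if s == "\r\n":
        return True
    if len(s) == 1:
        return True  # the "." alternative: any single character
    return not is_control(s[0]) and all(is_extend(c) for c in s[1:])

for text in ("A", "A\u0301"):
    print(repr(text),
          "combining character sequence:", is_combining_character_sequence(text),
          "legacy grapheme cluster:", is_legacy_grapheme_cluster(text))
```

Under these assumptions, 'A' alone fails the combining character sequence rule (no mark follows the base) but still satisfies the legacy grapheme cluster rule, while 'A' followed by U+0301 satisfies both.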
The "legacy grapheme clusters" are the simplest and most common forms of grapheme clusters, recognized in almost all
applications. Don't interpret "legacy" as meaning "included just for compatibility" or "still supported but
not recommended"; it just means the most restrictive definition, the one used in most legacy applications that don't
recognize the other forms.
The same can be said about the extended grapheme clusters that **are** also grapheme clusters.
Philippe.
> Message of 24/11/09 03:05
> From: "karl williamson"
> To: "unicode@unicode.org"
> Cc:
> Subject: ? Wrong definitions for combining character sequence in tr 29
>
>
> It is defined as
> base? ( Mark | ZWJ | ZWNJ )+
>
> That means that a mark is required. So the letter 'A' is not a grapheme
> cluster.
>
> Similarly for the definition for the extended
>
>
>
This archive was generated by hypermail 2.1.5 : Mon Nov 23 2009 - 23:07:45 CST