L2/02-035
To: | UTC |
Re: | Definition of Canonical Composite |
From: | Mark Davis |
Date: | 2001-01-23 |
I propose that we make the following addition to the end of 3.6 Decomposition (revision) in Unicode 3.2.
Add the following text after D23:
D23a Canonical composite: a character which is not identical to its canonical decomposition.
Category | Examples |
---|---|
non-composite | a, b, c, ... |
canonical composite | a-grave (à), kelvin sign (K), ... |
compatibility composite | micro-sign (µ), square mil (㏕), ... |
canonical & compatibility composite | greek upsilon with acute and hook symbol (ϓ), ... |
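The four categories in the table can be checked mechanically. The following is a minimal sketch using Python's standard `unicodedata` module, not part of the proposed text. It assumes that D21 treats a character as a compatibility composite when its compatibility decomposition differs from its canonical decomposition, and it uses full NFD and NFKD normalization as stand-ins for the two decompositions. The function name `classify` is hypothetical.

```python
import unicodedata


def classify(ch: str) -> str:
    """Place a single character in one of the four categories from the table.

    Assumptions (not stated in the proposal itself):
    - NFD(ch) stands in for the full canonical decomposition of ch.
    - NFKD(ch) stands in for the full compatibility decomposition of ch.
    - A compatibility composite is a character whose compatibility
      decomposition differs from its canonical decomposition.
    """
    nfd = unicodedata.normalize("NFD", ch)
    nfkd = unicodedata.normalize("NFKD", ch)

    is_canonical = nfd != ch        # proposed D23a
    is_compatibility = nfkd != nfd  # assumed reading of D21

    if is_canonical and is_compatibility:
        return "canonical & compatibility composite"
    if is_canonical:
        return "canonical composite"
    if is_compatibility:
        return "compatibility composite"
    return "non-composite"


if __name__ == "__main__":
    # Examples from the table, written as escapes to avoid look-alike characters.
    samples = {
        "LATIN SMALL LETTER A": "a",
        "LATIN SMALL LETTER A WITH GRAVE": "\u00e0",
        "KELVIN SIGN": "\u212a",
        "MICRO SIGN": "\u00b5",
        "SQUARE MIL": "\u33d5",
        "GREEK UPSILON WITH ACUTE AND HOOK SYMBOL": "\u03d3",
    }
    for name, ch in samples.items():
        print(f"U+{ord(ch):04X} {name}: {classify(ch)}")
```

Note that the kelvin sign is a canonical composite even though its decomposition is a single character (a singleton decomposition), which is one reason the definition is phrased as "not identical to its canonical decomposition" rather than in terms of sequences.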
We define decomposable character in D18. We say that it is equivalent
to one or more characters according to the decomposition mappings. We give composite
and precomposed as aliases for this term. However, we do not, in that
definition, distinguish between the kinds of decompositions that determine a
decomposable/composite character.
To resolve an ambiguity in our terminology, in Unicode 3.2 we are using the term
compatibility composite in D21.
That distinguishes a particular subset of composites, based on the kind of
decomposition. The term canonical composite distinguishes a different
subset of composites, based upon the other kind of decomposition. While it is in
some sense a natural fallout from the definitions of canonical and composite,
it is important enough that we should have a formal definition.
People too easily misuse the term composite (or precomposed) to mean only canonical composite -- I've seen this on a number of occasions. A formal definition -- and inclusion of the table with examples -- will help to reduce that confusion.