UTC/2000-005
Submitted by Mark Davis
January 13, 2000
CodePoint | 2060, 2061 |
Name | ZERO WIDTH GRAPHEME BREAK; ZERO WIDTH GRAPHEME JOIN |
GeneralCategory | Cf Other, Format |
BlockName | 2000; 206F; General Punctuation |
BlockName | |
MarkupUse | |
InformativeAnnotations | ZWGB between two characters indicates that the sequence of
characters should not be treated as a single grapheme in circumstances
where they otherwise would be.
ZWGJ between two characters indicates that the sequence of characters should be treated as a single grapheme in circumstances where they otherwise would not be. It can also be used within longer sequences, such as s[ZWGJ]h[ZWGJ]c[ZWGJ]h. Both of these characters may affect semantics, e.g. collation behavior, spell checking, word matching, etc. |
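The effect of the two characters on grapheme boundaries can be illustrated with a minimal Python sketch. It is not part of the proposal: the function name `split_graphemes` and the `tailored` parameter are hypothetical, the code points are the ones proposed in this document, and the sketch drops the format characters from its output and ignores edge cases such as adjacent format characters.

```python
# Code points as proposed in this document (an assumption of the sketch).
ZWGB = "\u2060"  # proposed ZERO WIDTH GRAPHEME BREAK
ZWGJ = "\u2061"  # proposed ZERO WIDTH GRAPHEME JOIN


def split_graphemes(text, tailored=("ch",)):
    """Split text into graphemes, honoring ZWGB and ZWGJ.

    `tailored` is a hypothetical list of multi-character sequences that
    the environment would otherwise treat as single graphemes.
    """
    graphemes = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == ZWGB:
            # The break character has already kept its neighbors apart,
            # because the tailored sequence no longer matches; skip it.
            i += 1
        elif ch == ZWGJ:
            # Glue the following character onto the previous grapheme.
            if graphemes and i + 1 < len(text):
                graphemes[-1] += text[i + 1]
                i += 2
            else:
                i += 1
        else:
            # Try tailored sequences, longest first, then a single character.
            for seq in sorted(tailored, key=len, reverse=True):
                if text.startswith(seq, i):
                    graphemes.append(seq)
                    i += len(seq)
                    break
            else:
                graphemes.append(ch)
                i += 1
    return graphemes
```

With the default `tailored` list, "chata" yields a leading "ch" grapheme, "c[ZWGB]hata" yields separate "c" and "h", and s[ZWGJ]h[ZWGJ]c[ZWGJ]h yields the single grapheme "shch".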
InformativeAnnotations | |
CollationBehavior | In the default collation ordering, these are completely
ignorable. In a tailored collation ordering, they can be used to
distinguish sequences that form graphemes from those that do not. For
example, in a Slovak collation, "ch" sorts as a single collation
element after "c". Spelling a word with "c[ZWGB]h"
disables that contraction. (Such a spelling has no effect if there is no
"ch" sequence in the collation ordering.)
Any tailored collation ordering that contains contracting elements should add ZWGJ within the sequences. E.g. Slovak should have both the following rules: c < ch; |
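The collation behavior described above can be sketched as a toy sort-key function. This is an illustration under stated assumptions, not a conformant collation implementation: the names `ELEMENTS`, `WEIGHT`, `CONTRACTIONS` and `sort_key` are hypothetical, the ordering simply places "ch" immediately after "c" as in the rule quoted above (a real Slovak tailoring differs in detail), and only a single weight level is modeled.

```python
import string

# Code points as proposed in this document (an assumption of the sketch).
ZWGB = "\u2060"  # proposed ZERO WIDTH GRAPHEME BREAK
ZWGJ = "\u2061"  # proposed ZERO WIDTH GRAPHEME JOIN

# Toy tailoring: a-z with the contraction "ch" inserted right after "c".
ELEMENTS = list(string.ascii_lowercase)
ELEMENTS.insert(ELEMENTS.index("c") + 1, "ch")
WEIGHT = {element: w for w, element in enumerate(ELEMENTS, start=1)}

# Both the plain and the explicitly joined spelling map to the same
# collation element. The longer spelling is listed first so it wins.
CONTRACTIONS = {"c" + ZWGJ + "h": "ch", "ch": "ch"}


def sort_key(word):
    """Return a list of primary weights for `word` under the toy tailoring."""
    key = []
    i = 0
    while i < len(word):
        match = next((s for s in CONTRACTIONS if word.startswith(s, i)), None)
        if match is not None:
            key.append(WEIGHT[CONTRACTIONS[match]])
            i += len(match)
        elif word[i] in (ZWGB, ZWGJ):
            # Completely ignorable: contributes no weight of its own.
            i += 1
        else:
            # Characters outside the toy alphabet sort after everything else.
            key.append(WEIGHT.get(word[i], len(WEIGHT) + 1))
            i += 1
    return key
```

Sorting "cha", "cza" and "c[ZWGB]ha" with `sorted(words, key=sort_key)` places "c[ZWGB]ha" first (it sorts as c, h, a), then "cza", then "cha" (which starts with the contracted element).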
DecompositionClass | no decomposition |
CharacterDecomposition |
|
CanonicalOrdering | 0 - Spacing, split, enclosing, reordrant, and Tibetan subjoined |
CanonicalCombiningClass | 0 |
BidirectionalCategory | ON Other Neutrals |
NumericType | none no numeric value |
CanonicalCombiningClass1 |
|
CanonicalCombiningClass2 |
|
CanonicalCombiningClass3 |
|
CanonicalCombiningClass4 |
|
LineBreak | IN Inseparable |
EastAsianWidth | Neutral does not occur in EA sets |
CursiveShaping | T transparent to linking (non-spacing marks) |
ComplexShaping | Whether graphemes are joined or not should not otherwise affect cursive or ligature behavior in normal circumstances. Any exceptions to this should be specifically listed in the script descriptions. |