From: Karl Pentzlin (karl-pentzlin@acssoft.de)
Date: Thu Apr 13 2006 - 15:57:08 CST
On Thursday, 13 April 2006 at 22:27, Kenneth Whistler wrote:
KW> Karl Pentzlin inquired:
>> I try to understand whether "E" + CGJ + "s" + CGJ + "c" + U+20E3
>> COMBINING ENCLOSING KEYCAP should produce a representation of an
>> "Esc" key in plain text (given an appropriate font rendering
>> mechanism).
KW> It should not. ...
Therefore, I propose to add an entry to the Unicode FAQ (section
"Characters, Combining marks"), something like the following
(of course only if my understanding of the mechanism is correct now):
Q: Is it possible to apply a diacritic or combining enclosing mark to
a sequence of more than one (non-combining) character?
A: No, with the exception of the "double diacritics", which are
deliberately designed to be applied to a two-letter sequence, e.g.
U+035D COMBINING DOUBLE BREVE.
Neither ZWJ (U+200D ZERO WIDTH JOINER) nor CGJ (U+034F COMBINING
GRAPHEME JOINER) "glues" characters together in a way that would affect
the scope of any following combining character.
To enclose a character sequence like "Esc" in something like the U+20E3
COMBINING ENCLOSING KEYCAP, you must resort to higher-level protocols.
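To illustrate the scope rule, here is a rough Python sketch. It assumes a
simplified reading of "combining character sequence": a base character
followed by any run of combining marks (General_Category M*), ZWJ, or ZWNJ.
The helper names is_extender and combining_sequences are made up for this
example and are not part of any Unicode library; real text segmentation
follows more detailed rules than this.

```python
import unicodedata

ZWJ, ZWNJ = "\u200d", "\u200c"

def is_extender(ch):
    # Assumption: treat any combining mark (Mn, Mc, Me), ZWJ, or ZWNJ
    # as extending the sequence started by the preceding character.
    return unicodedata.category(ch).startswith("M") or ch in (ZWJ, ZWNJ)

def combining_sequences(text):
    # Split text into simplified combining character sequences.
    # A leading extender with no base starts a (defective) sequence of its own.
    seqs = []
    for ch in text:
        if seqs and is_extender(ch):
            seqs[-1] += ch
        else:
            seqs.append(ch)
    return seqs

# "E" + CGJ + "s" + CGJ + "c" + U+20E3 COMBINING ENCLOSING KEYCAP
esc = "E\u034fs\u034fc\u20e3"
for seq in combining_sequences(esc):
    print(" ".join(f"U+{ord(c):04X}" for c in seq))
```

Under these assumptions the string falls apart into three sequences, and
the keycap ends up in the last one, attached to "c" only. The two CGJs are
themselves just combining marks attached to "E" and "s"; they do not pull
the three letters into one unit.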
--
KW> A CGJ by itself is simply a defective combining character sequence.
KW> A CGJ does not *construct* grapheme clusters, if that is what you
KW> are getting at.
Maybe you can add to the FAQ entry "Q: Does U+034F COMBINING GRAPHEME
JOINER join graphemes?" a statement like "In particular, it cannot be
used to *construct* grapheme clusters out of arbitrary character
sequences, or to extend the scope of subsequent combining characters."
- Karl Pentzlin