L2/11-266
Source: Mark Davis
Date: July 6, 2011
Subject: Handling of Cn/Cs/Co characters in UAX #29
Karl Williamson raised the issue of why the sequence <Cn + Extend> forms a
grapheme cluster. While these are degenerate cases, the UTC should consider
whether overall behavior would be better if we added the three oddball cases
([:cn:][:cs:][:co:]) to [:gcb:control:].
It would also make the usage align more closely with the current definition
of the Grapheme_Base property. That is, if we added [:cn:][:cs:][:co:] to
[:gcb:control:], then Grapheme_Base would be equivalent to all code points
outside of [[:gcb:extend:][:gcb:lf:][:gcb:cr:][:gcb:control:]].
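
The effect on segmentation can be illustrated with a minimal Python sketch. It
is not a UAX #29 implementation: it models only rules GB4 (break after Control,
CR, or LF) and GB9 (no break before Extend). The helper names proposed_gcb and
breaks_before_extend are hypothetical, and the caller is assumed to supply the
code point's current Grapheme_Cluster_Break value, since Python's unicodedata
module does not expose that property.

```python
import unicodedata

# General categories this proposal would fold into GCB=Control.
ODDBALL_CATEGORIES = {"Cn", "Cs", "Co"}


def proposed_gcb(cp: int, current_gcb: str) -> str:
    """Return the GCB value a code point would have under the proposal.

    current_gcb is the value assumed to be assigned today (hypothetical
    input; look it up in GraphemeBreakProperty.txt for real use).
    """
    if unicodedata.category(chr(cp)) in ODDBALL_CATEGORIES:
        return "Control"
    return current_gcb


def breaks_before_extend(prev_gcb: str) -> bool:
    """Is there a cluster boundary between prev and a following Extend?

    GB4 forces a break after Control, CR, or LF; otherwise GB9 forbids
    a break before Extend.
    """
    return prev_gcb in {"Control", "CR", "LF"}


if __name__ == "__main__":
    # U+FFFF is a noncharacter (Cn), U+D800 a surrogate (Cs),
    # U+E000 a private-use code point (Co); "Other" is the assumed
    # current GCB value for each.
    for cp in (0xFFFF, 0xD800, 0xE000):
        before = breaks_before_extend("Other")
        after = breaks_before_extend(proposed_gcb(cp, "Other"))
        print(f"U+{cp:04X}: break before Extend now={before}, proposed={after}")
```

With the assumed current value of Other, each of the three code points joins a
following Extend character into one cluster; under the proposal GB4 applies and
the Extend character starts a new cluster.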