From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Dec 08 2004 - 19:23:11 CST
Marcin asked:
> The general trouble is that numeric character references can only
> encode individual code points
By design.
> rather than graphemes (is this a correct
> term for a non-combining code point with a sequence of combining code
> points?).
No. The correct term is "combining character sequence".
TUS 4.0, p. 70, D17.
The correct NCR representation of a combining character sequence
is a sequence of NCRs, one per code point. -- Not too surprisingly.
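As an illustration of that point (a minimal sketch, not part of the original message): the Python snippet below encodes a combining character sequence as one numeric character reference per code point. The helper name `to_ncrs` is hypothetical, and the choice of hexadecimal references is an assumption; decimal references would work equally well.

```python
import html
import unicodedata


def to_ncrs(text: str) -> str:
    """Encode every code point of `text` as a hexadecimal NCR (hypothetical helper)."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


# One combining character sequence made of two code points:
# U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT.
seq = "e\u0301"

# The second code point is a combining mark (non-zero combining class).
assert unicodedata.combining(seq[1]) != 0

ncrs = to_ncrs(seq)
print(ncrs)  # &#x65;&#x301;

# Decoding the references yields the original two code points again.
assert html.unescape(ncrs) == seq
```

Each NCR still names exactly one code point; the sequence only becomes a single user-perceived unit after the references are resolved.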
--Ken
> So if XML is supposed to be treated as a sequence of
> graphemes, weird effects arise in the above boundary cases...