In HTML or XML you always use the code point itself (i.e. the single UTF-32 value), not a series of
code units (UTF-8 or UTF-16). Thus you would use:
𐄣
not �� from UTF-16
nor 𐄣 from UTF-8
Mark
Brendan Murray/DUB/Lotus wrote:
> How can one encode a surrogate character as an entity in HTML/XML? Should
> it be as two separate characters or as one 32-bit value? In other words
> should it be:
> ꯍïGH;
> or
> �GH;
>
> Brendan