In HTML or XML you always use the code point itself (i.e. the single UTF-32 value), not a series of
code units (UTF-8 or UTF-16). Thus you would use:
𐄣
not �� from UTF-16
nor 𐄣 from UTF-8
Mark
Brendan Murray/DUB/Lotus wrote:
> How can one encode a surrogate character as an entity in HTML/XML? Should
> it be as two separate characters or as one 32-bit value? In other words
> should it be:
> ꯍïGH;
> or
> �GH;
>
> Brendan