Re: Is there Unicode mail out there?

From: Lars Marius Garshol (larsga@garshol.priv.no)
Date: Tue Jul 17 2001 - 05:15:05 EDT


* Michael Everson
|
| Perhaps I have been asleep, but is that notation (&#Xxxxx;) valid
| HTML for all Unicode characters?

The numeric character reference syntax is defined by SGML; HTML merely
references it. SGML defines the syntax in terms of the document
character set, which in turn is defined by the SGML declaration used
by each SGML application (HTML being one such application).
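
As a rough illustration of that point, here is a minimal Python sketch
that resolves a numeric character reference to a character number in
the document character set. The name resolve_ncr is made up for this
example, the document character set is assumed to be Unicode (as it is
for HTML), and no check for disallowed characters is made.

```python
import re

# Decimal (&#233;) and hexadecimal (&#xE9;) forms. Accepting an
# upper-case "X" as well is a leniency of this sketch, not a claim
# about what any particular specification permits.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def resolve_ncr(ref: str) -> str:
    """Return the character that a numeric character reference denotes."""
    m = _NCR.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a numeric character reference: {ref!r}")
    hex_digits, dec_digits = m.groups()
    number = int(hex_digits, 16) if hex_digits else int(dec_digits)
    # Assumption: the document character set is Unicode, so the number
    # is a code point, independent of the encoding of the file itself.
    return chr(number)
```

With this sketch, resolve_ncr("&#233;") and resolve_ncr("&#xE9;") give
the same character, whether the document itself is stored as Latin-1,
UTF-8 or anything else.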

A numeric character reference can therefore refer to any character in
the document character set (as declared by the SGML declaration used
by HTML[1]). The document character set used by HTML is Unicode, but
some characters have been disallowed and may not appear in documents,
whether directly or by reference. These are:

 U+0000 - U+0008
 U+000B - U+000C
 U+000E - U+001F
 U+007F - U+009F
 U+D800 - U+DFFF
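
A check along these lines can be sketched in Python. The ranges below
are transcribed by hand from the UNUSED entries in the declaration at
[1] and should be verified against it; the names DISALLOWED,
is_allowed and to_ncr are invented for this sketch.

```python
# Inclusive ranges marked UNUSED in the HTML 4.01 SGML declaration.
# Hand-transcribed assumption; verify against the declaration itself.
DISALLOWED = [
    (0x0000, 0x0008),
    (0x000B, 0x000C),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xD800, 0xDFFF),
]

def is_allowed(code_point: int) -> bool:
    """True if the code point may appear in an HTML 4.01 document."""
    if not 0 <= code_point <= 0x10FFFF:
        return False
    return not any(lo <= code_point <= hi for lo, hi in DISALLOWED)

def to_ncr(ch: str) -> str:
    """Write a single character as a hexadecimal character reference."""
    cp = ord(ch)
    if not is_allowed(cp):
        raise ValueError(f"U+{cp:04X} may not appear in an HTML document")
    return f"&#x{cp:X};"
```

For example, to_ncr("\u20ac") yields "&#x20AC;", while passing a
character such as U+0080 raises ValueError, since it may not be
referenced any more than it may appear directly.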

--Lars M.

[1] <URL: http://www.w3.org/TR/html401/sgml/sgmldecl.html >


