Re: HTML - i18n / NCR & charsets

From: Mark Davis (mark_davis@taligent.com)
Date: Wed Nov 27 1996 - 12:59:09 EST


Keld,

You can find information on how to get the Unicode Standard at
http://unicode.org.

Mark

unicode@Unicode.ORG wrote:
>
> unicode@Unicode.ORG writes:
>
> > >According to the HTML I18N spec, all that is needed in this case is to
> > >specify CHARSET=CP1251, and the text would be correctly converted to the
> > >equivalent Unicode characters.
> >
> > The issue is not the coded content of the document, about which you are
> > correct. The issue is numeric character references of the form &#nnnn;.
> > Some HTML documents today use numeric references in the C1 range,
> > assuming they are the extra characters in cp1251. This is contrary to the
> > i18n spec, which states that all numeric character references refer to
> > Unicode. This means that all references in the C1 range are illegal
> > according to the spec.
>
> A subtlety: the i18n spec refers to UCS, which has a consequence
> when going beyond the BMP. There UCS has well-defined numbers, while I
> do not know whether Unicode has this.
>
> Keld
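
The numeric-reference point quoted above can be illustrated with a short
sketch. It is only an illustration, not part of the original thread: the
helper names (is_c1, ncr_as_unicode, ncr_as_cp1251) are hypothetical, and
the value 132 is an arbitrary example from the C1 range. It contrasts the
reading required by the i18n spec (the number is a Unicode/UCS code point)
with the reading some documents assume (the number is a cp1251 byte).

```python
# Hypothetical helpers, for illustration only.

def is_c1(n: int) -> bool:
    """True if n falls in the C1 control range U+0080..U+009F."""
    return 0x80 <= n <= 0x9F

def ncr_as_unicode(n: int) -> str:
    """Reading required by the i18n spec: n is a Unicode/UCS code point."""
    return chr(n)

def ncr_as_cp1251(n: int) -> str:
    """Reading assumed by some documents: n is a byte value in cp1251."""
    return bytes([n]).decode("cp1251")

value = 132                          # as written in a document: &#132;
spec_char = ncr_as_unicode(value)    # U+0084, a C1 control character
legacy_char = ncr_as_cp1251(value)   # U+201E, DOUBLE LOW-9 QUOTATION MARK

# A reference that names the intended character by its Unicode code point.
conforming_ncr = f"&#{ord(legacy_char)};"

print(is_c1(value), hex(ord(spec_char)), hex(ord(legacy_char)), conforming_ncr)
```

Under these assumptions, a document that wants the low quotation mark
would write &#8222; rather than &#132;, whatever CHARSET it declares.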


