Re: HTML - i18n / NCR & charsets

From: Jonathan Rosenne (100320.1303@CompuServe.COM)
Date: Wed Nov 27 1996 - 00:44:57 EST


Alain LaBonté wrote:
>This is my understanding too. I agree that it is implicitly covered in the
>UCS (in other words this space is implicitly included in the UCS as meaning
>control characters), and I would assume, in UNICODE also. This is a big
>concern for French as the way Windows extended Latin 1 8-bit table codes
><oe>, <OE> and <Y:> creates a huge difficulty calling for a new 8-bit code
>deprecating Latin 1 (compatible with it, except for unused characters such
>as stand-alone accents, useless) to avoid the disappearance of those
>characters. Then we could have straightforward conversions as long as 8-bit
>code will coinhabit with the UCS. Expect such a project in ISO, beginning
>with a registration.

According to the HTML I18N spec, all that is needed in this case is to specify
CHARSET=CP1252 (the Windows extension of Latin 1; CP1251 is the Cyrillic code
page), and the text will be correctly converted to the equivalent Unicode
characters.

It should be quite clear that CP1252 is not 8859-1, just as CP1255 is not
8859-8.
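
A minimal Python sketch of the point above, not taken from the spec itself.
It assumes the Windows extension of Latin 1 that carries <oe>, <OE> and <Y:>
is the one Python exposes as the codec "cp1252", and that 8859-1 and 8859-8
are the codecs "iso8859_1" and "iso8859_8". The helper names to_unicode and
differing_bytes are hypothetical, chosen only for illustration.

```python
def to_unicode(raw: bytes, charset: str) -> str:
    """Decode raw bytes using the charset label declared for the document."""
    return raw.decode(charset)


def differing_bytes(charset_a: str, charset_b: str) -> list:
    """Return the byte values that the two charsets decode differently."""
    result = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" maps unassigned bytes to U+FFFD instead of raising.
        decoded_a = raw.decode(charset_a, errors="replace")
        decoded_b = raw.decode(charset_b, errors="replace")
        if decoded_a != decoded_b:
            result.append(value)
    return result


# The French word "oeuvre" with the ligature stored as the single byte 0x9C.
sample = bytes([0x9C, 0x75, 0x76, 0x72, 0x65])

# Labelled correctly, byte 0x9C becomes U+0153 (the oe ligature).
as_cp1252 = to_unicode(sample, "cp1252")

# Labelled as 8859-1, the same byte becomes the C1 control U+009C.
as_latin1 = to_unicode(sample, "iso8859_1")

# Byte positions where each Windows code page departs from its ISO cousin.
latin_diff = differing_bytes("cp1252", "iso8859_1")
hebrew_diff = differing_bytes("cp1255", "iso8859_8")
```

With these codecs, latin_diff covers exactly the range 0x80 to 0x9F, which is
the control-character space discussed in the quoted text; everything outside
that range decodes identically. hebrew_diff is likewise non-empty, so the
charset label has to name the code page actually used.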

Jonathan


