Re: Usage of CP1252 characters on www.msnbc.com

From: Erik van der Poel (erik@netscape.com)
Date: Mon Jul 07 1997 - 21:22:40 EDT


Unicode Discussion wrote:

> They have many documents that use "smart quotes":
> 0x93 0x201c #LEFT DOUBLE QUOTATION MARK
> 0x94 0x201d #RIGHT DOUBLE QUOTATION MARK
>
> How to represent these in HTML?
>
> 1. Convert to dumb quotes
> They would be hopping mad if these turned into "dumb" straight quotes.
>
> This may seem like a reasonable degradation to the average technical
> person, but to customers this is known as "document corruption".
>
> 2. Write out as Unicode &# NCRs. (the "correct" way)
> Unless they are using a Unicode enabled browser, these are ignored as
> noted in this mail stream. On a corporate Intranet it is conceivable
> that you could tell the customer they are required to upgrade their
> browsers to the newest ones, but they really don't like that kind of
> thing. On the Internet itself, it is obvious that only a fraction of
> people upgrade to newer browsers as you noted. Fortunately this is a
> growing fraction.
>
> 3. Write out as &#147 and &#148.
> Oh look, on all of the customer's machines these display just fine. It
> turns out that virtually all old browsers can understand these
> characters. There is a small % that does not (e.g. some Unix
> browsers).
> This is a problem for the external web site, but all the home users
> they are trying to reach can read those characters fine.
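
For concreteness, the three quoted options can be sketched in Python roughly as follows. This is only an illustration: the function names are made up for this example, and the sketch assumes the document text is already held as Unicode, with the smart quotes as U+201C and U+201D.

```python
# Hypothetical helpers illustrating the three options quoted above.
LEFT, RIGHT = "\u201c", "\u201d"  # LEFT / RIGHT DOUBLE QUOTATION MARK


def to_dumb_quotes(text):
    """Option 1: degrade smart quotes to straight ASCII quotes."""
    return text.replace(LEFT, '"').replace(RIGHT, '"')


def to_unicode_ncrs(text):
    """Option 2: numeric character references using Unicode code points."""
    return text.replace(LEFT, "&#8220;").replace(RIGHT, "&#8221;")


def to_cp1252_ncrs(text):
    """Option 3: numeric references that reuse the CP1252 byte values."""
    return "".join(
        "&#%d;" % ch.encode("cp1252")[0] if ch in (LEFT, RIGHT) else ch
        for ch in text
    )


sample = LEFT + "hello" + RIGHT
print(to_dumb_quotes(sample))
print(to_unicode_ncrs(sample))
print(to_cp1252_ncrs(sample))
```

Note that 147 and 148 are the CP1252 byte values 0x93 and 0x94, not Unicode code points for quotation marks, which is why option 3 depends on the browser treating those numbers as CP1252.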

A 4th possibility would be to use the actual single-byte values from
CP1252. Strictly speaking, the document would then have to have a
charset label that corresponds to CP1252. I also think Microsoft should
register "windows-1252" as an alias for
"ISO-8859-1-Windows-3.1-Latin-1".
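
A minimal sketch of this 4th possibility in Python, assuming the server is free to choose the Content-Type header. The function name is hypothetical, and "windows-1252" is used as the label only because it is the alias proposed above; whether a given browser recognizes that label is an assumption.

```python
def cp1252_page(body_text):
    """Return a (header, payload) pair for a page sent as raw CP1252 bytes."""
    # Label is the proposed alias; treat browser support as an assumption.
    header = "Content-Type: text/html; charset=windows-1252"
    # Each smart quote becomes a single byte (0x93 or 0x94).
    payload = body_text.encode("cp1252")
    return header, payload


header, payload = cp1252_page("\u201cquoted\u201d")
print(header)
print(payload[0] == 0x93, payload[-1] == 0x94)

# Without the label, a client that assumes ISO-8859-1 would read the same
# bytes as C1 control characters instead of quotation marks.
mislabeled = payload.decode("iso-8859-1")
print(mislabeled[0] == "\x93")
```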

Erik van der Poel @ Netscape, but not speaking for Netscape


