Re: 8859-1, 8859-15, 1252 and Euro

From: Robert A. Rosenberg (bob.rosenberg@digitscorp.com)
Date: Wed Feb 09 2000 - 12:28:58 EST


At 02:56 PM 02/07/2000 -0800, A. Vine wrote:
>Tim Greenwood wrote:
> >
> > Pretty much all of the pages on the web, and the browsers, ignore the
> > differences between ISO-8859-1 and Windows code page 1252.
>
>I wish they would! I'm pretty sick of seeing question marks where there
>should be quotes, apostrophes, bullets, em-dashes, etc.

The real [short-term] solution is to have a preference switch that says
"Treat ISO-8859-1 as Windows-1252", so that the "undefined" (0x80-0x9F) range
maps to the Windows-1252 characters. Users should also send some
"clue-by-four" error messages to web sites that use this character range but
label their pages as ISO-8859-1 instead of windows-1252 (i.e., ask them to
show the CORRECT character set). IMO, a BUG REPORT to ADOBE and MS saying
that their Web design products should FORCE windows-1252 into the HTML
whenever these characters are used is not out of line.
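
A minimal sketch of what such a switch could do on the decoding side,
assuming Python's built-in codecs. The function names and the
`treat_latin1_as_1252` flag are hypothetical, not taken from any browser.
Python's cp1252 codec leaves five bytes in the range unassigned (0x81, 0x8D,
0x8F, 0x90, 0x9D), so the sketch passes those through as the matching C1
code points.

```python
import codecs


def decode_as_1252(data: bytes) -> str:
    """Decode bytes as windows-1252, tolerating its unassigned bytes."""
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # Unassigned in cp1252: keep the byte as the same-numbered
            # code point, which is what ISO-8859-1 decoding would give.
            out.append(chr(b))
    return "".join(out)


def decode_page(data: bytes, declared: str,
                treat_latin1_as_1252: bool = True) -> str:
    """Decode page bytes, optionally reading an ISO-8859-1 label as 1252."""
    # codecs.lookup() normalizes aliases such as "latin-1" or "ISO-8859-1".
    canonical = codecs.lookup(declared).name
    if treat_latin1_as_1252 and canonical == "iso8859-1":
        return decode_as_1252(data)
    return data.decode(declared)


# Curly quotes sent as 0x93/0x94 under an ISO-8859-1 label:
text = decode_page(b"\x93Hi\x94", "ISO-8859-1")
```

With the switch off, the same bytes decode to invisible C1 control
characters, which is roughly the failure described above.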

> >
> > So what is a system that stores all data in Unicode and converts for web
> > output to do with U+20AC? The formally correct process would seem to be to
> > convert to 0x80 only for CP1252 (and the other CP12xx sets), to 0xa4 for
> > ISO-8859-15, and to the 'not a character in this set' sign for ISO-8859-1.
> > This may be formally correct, but would not help the majority of users. For
> > that we would convert to 0x80 for ISO-8859-1 - it works even though 'wrong'.
>
>Sure, if you don't care about Unix users.
>Unless Linux does something different?
>
>Andrea
>--
>Andrea Vine, avine@eng.sun.com, iPlanet i18n architect
>A word is not a crystal, transparent and unchanging--it is the skin of a
>living thought, and may vary greatly in color and content according to the
>circumstances and time in which it is used. - Supreme Court Justice Holmes
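
The conversion choices Tim describes for U+20AC can be sketched as follows,
again assuming Python's built-in codecs. The function name and the
`pragmatic` flag are hypothetical. The "not a character in this set" sign is
represented here by the codec's "?" replacement.

```python
import codecs

EURO = "\u20ac"


def encode_for_web(text: str, charset: str, pragmatic: bool = False) -> bytes:
    """Encode text for output under the given charset label.

    With pragmatic=True, an ISO-8859-1 label gets windows-1252 bytes:
    formally 'wrong', but it displays where the two are treated alike.
    """
    canonical = codecs.lookup(charset).name
    if pragmatic and canonical == "iso8859-1":
        return text.encode("cp1252", errors="replace")
    # Formally correct path: unencodable characters become b"?".
    return text.encode(charset, errors="replace")


formal = {cs: encode_for_web(EURO, cs)
          for cs in ("windows-1252", "iso-8859-15", "iso-8859-1")}
lenient = encode_for_web(EURO, "iso-8859-1", pragmatic=True)
```

Here `formal` holds 0x80 for windows-1252, 0xA4 for ISO-8859-15 and "?" for
ISO-8859-1, while `lenient` is the 0x80 byte that a strict ISO-8859-1
reader will not display as a euro sign.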


