From: Doug Ewell (doug@ewellic.org)
Date: Tue Feb 09 2010 - 07:29:11 CST
"verdy_p" <verdy underscore p at wanadoo dot fr> wrote:
> If the algorithm considers the ISO 8859-x tag unreliable because the page
> contains some Windows 125x characters (in the code range 0x80-0x9F),
> the tag is probably wrong: assume Windows 125x instead and use it as the
> secondary indicator (after the statistical estimation heuristic).
That's what I said. Maybe if browsers followed this strategy anyway,
the authors of HTML5 wouldn't have felt it necessary to demand it.
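
As a rough illustration of the strategy discussed above, here is a minimal Python sketch. The names `WINDOWS_FALLBACK`, `effective_encoding` and `decode_page` are made up for this example, and the mapping covers only the two pairs mentioned in this thread; it is not taken from any browser or from the HTML5 text.

```python
import re

# Any byte in the C1 range 0x80-0x9F.
C1_BYTES = re.compile(rb"[\x80-\x9f]")

# Hypothetical mapping: declared ISO 8859 label -> Windows code page to assume.
# Only pairs that differ solely in the C1 range are listed.
WINDOWS_FALLBACK = {
    "iso-8859-1": "cp1252",
    "iso-8859-9": "cp1254",
}

def effective_encoding(declared: str, data: bytes) -> str:
    """Return the encoding to decode with, given the declared label."""
    name = declared.strip().lower()
    fallback = WINDOWS_FALLBACK.get(name)
    # C1 bytes in a page tagged as ISO 8859-x suggest the tag is wrong.
    if fallback is not None and C1_BYTES.search(data):
        return fallback
    return name

def decode_page(declared: str, data: bytes) -> str:
    """Decode the page; bytes undefined in the chosen code page become U+FFFD."""
    return data.decode(effective_encoding(declared, data), errors="replace")
```

For example, `decode_page("ISO-8859-1", b"\x93hi\x94")` decodes the two C1 bytes as curly quotation marks instead of C1 control characters.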
Note, though, that there are only a few Windows single-byte code pages
which differ from their ISO 8859 counterparts only by adding graphic
characters in the C1 range (0x80-0x9F). The relationship between 1252 and
8859-1 is well known, and 1254 and 8859-9 are like this as well, but most
pairs are not; 1250 and 8859-2, for example, have numerous other differences.
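
One way to check this is to compare the two mappings byte by byte outside the C1 range. The sketch below uses the codec tables that ship with Python; `differing_bytes` is a hypothetical helper written for this example, and it assumes Python's tables match the published code page definitions.

```python
def differing_bytes(iso: str, win: str) -> list:
    """Return the bytes in 0xA0-0xFF that decode differently under the two codecs."""
    diffs = []
    for b in range(0xA0, 0x100):
        raw = bytes([b])
        # errors="replace" keeps the comparison going if a byte is unassigned.
        if raw.decode(iso, errors="replace") != raw.decode(win, errors="replace"):
            diffs.append(b)
    return diffs

pairs = [("latin-1", "cp1252"), ("iso8859-9", "cp1254"), ("iso8859-2", "cp1250")]
for iso, win in pairs:
    diffs = differing_bytes(iso, win)
    print(iso, "vs", win, "->", len(diffs), "differing bytes in 0xA0-0xFF")
```

The first two pairs should report no differences in that range, while 8859-2 and 1250 disagree at several positions (0xA1, for instance), so a simple "C1 bytes present" test is not enough to justify swapping one for the other.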
--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s