From: James Kass (jameskass@att.net)
Date: Thu Jul 13 2006 - 08:23:08 CDT
Philippe Verdy wrote,
> An example of website that becomes horrible because of that, or that exhibits
> runtime errors in javascripts due to incorrect selection in a page that is
> clearly French and with enough text content to confirm this, including the
> domain name, and where no CJK charset should be autodetected:
> http://www.croix-rouge.fr/
> (this is the official web site for the French delegation of the Red Cross).
The French Red Cross page has no character set declaration. (Well, it has one,
but it is left blank/empty.) The very first character in the HTML file which
is not mark-up is ✚ (U+271A, HEAVY GREEK CROSS). But that character is written
as an NCR, which is, of course, plain ASCII and shouldn't affect any heuristics
regarding character sets.
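A minimal sketch of that point, in Python. The decimal spelling of the NCR
used here is my assumption; the page may spell the reference differently.
Either way, the bytes on the wire are all below 0x80, and the cross only
appears after the HTML parser expands the reference.

```python
import html

# Assumed spelling of the reference; U+271A is 10010 in decimal.
ncr = "&#10010;"

# Encoding to ASCII succeeds only because every character in the NCR is ASCII.
raw_bytes = ncr.encode("ascii")

# This is the step an HTML parser performs, after charset detection is over.
decoded = html.unescape(ncr)

print(raw_bytes)
print(f"U+{ord(decoded):04X}")
```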
Accented characters written as named HTML references (like &eacute;) display
just fine on this page, while raw non-ASCII material seems to display as CJK
ideographs.
Interestingly, setting the character set to auto-detect in MSIE 6 results in
correct display. (I normally operate with auto-detect disabled.)
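The difference between the two cases can be sketched in Python. Two things
here are assumptions of mine and not facts about the Red Cross page: that
its raw bytes are ISO-8859-1, and that the browser guessed GBK (any CJK
multibyte character set would serve for the illustration).

```python
import html

# Named references are pure ASCII, so a wrong charset guess cannot hurt them.
escaped = "&eacute;t&eacute;".encode("ascii")

# Assumption: the raw accented text was saved as ISO-8859-1 (bytes E9 74 E9).
raw = "été".encode("iso-8859-1")

# Assumption: the browser wrongly settled on GBK.
wrong_guess = "gbk"

# ASCII bytes decode the same way under GBK; the parser then expands them.
from_escaped = html.unescape(escaped.decode(wrong_guess))

# 0xE9 is treated as the lead byte of a two-byte sequence, so the accented
# letters are swallowed into something else (or into U+FFFD).
from_raw = raw.decode(wrong_guess, errors="replace")

print(repr(from_escaped))
print(repr(from_raw))
```

The first result is the intended word; the second is not, which matches
what I see on the page.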
In the absence of a character set declaration in the HTML, why shouldn't
a modern browser default to UTF-8? Unicode is the universal character
set, and UTF-8 is its most popular encoding in web pages.
> Having to manually select the correct encoding when navigating a large web site
> with many pages is really irritating for users...
Which is why I normally operate with auto-select disabled. Choose the
character set which you expect to encounter most often, set the browser
to that character set, and disable auto-select. Pages correctly labelled
and served will display in their correct character sets; pages which aren't
will display in your selected default.
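Roughly, the behaviour I am describing looks like the Python sketch below.
The function names and the UTF-8 default are hypothetical, chosen for
illustration; this is not how MSIE or any other browser is actually written.

```python
from typing import Optional


def choose_encoding(declared: Optional[str], user_default: str = "utf-8") -> str:
    """Use the page's declared charset if it has one, else the user's default.

    A declaration that is present but blank, as on the Red Cross page,
    is treated the same as no declaration at all.
    """
    if declared is not None and declared.strip():
        return declared.strip().lower()
    return user_default


def decode_page(body: bytes, declared: Optional[str],
                user_default: str = "utf-8") -> str:
    """Decode a page body with no auto-detection step in between."""
    # Note: an unrecognised charset label would raise LookupError here.
    encoding = choose_encoding(declared, user_default)
    return body.decode(encoding, errors="replace")


# A correctly labelled page is decoded as labelled.
print(decode_page("é".encode("iso-8859-1"), "ISO-8859-1"))
# A blank declaration falls back to the user's chosen default.
print(decode_page("é".encode("utf-8"), ""))
```

The point of the design is that there is no guessing: a bad page fails the
same way every time, and in the character set you expect most often.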
> ... (why doesn't IE consider the
> selected encoding of the previous page when navigating across pages of the same
> domain, when the same multiple encodings are possible candidates for the autodetection heuristic?)
If all the pages in the same large domain are equally bad, this wouldn't
solve anything. In the Red Cross example, the redirect page to the example
seems to have the same problem.
> ...but non-profit organization often lack the money and internal development team
> to make such corrections in what could be a nightmare for them to handle ...
Sounds like they need a volunteer. Perhaps someone who speaks French and
appears to have a little spare time?
Best regards,
James Kass