Re: ISO European characters in Unicode?

From: addison@inter-locale.com
Date: Wed Oct 04 2000 - 19:35:30 EDT


Hi George,

Changing the display encoding in your browser doesn't change how the page
is encoded. The page is still encoded in the ISO-8859-1 (Latin-1) character
set. Changing your view encoding will only mess up the display.
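
To make that concrete, here is a minimal Python sketch of what the browser
is doing when you switch the view to UTF-8. The sample word (a Norwegian
place name containing the slashed o) is my own assumption, not text taken
from the Yahoo page, and a real browser may show a different placeholder
glyph than the one Python substitutes.

```python
# Assumed sample text: "Tromso" spelled with the slashed o (U+00F8).
text = "Troms\u00f8"

# The server sends the page as ISO-8859-1, where the slashed o
# is the single byte 0xF8.
latin1_bytes = text.encode("iso-8859-1")

# Viewing the bytes as Latin-1 (the page's real encoding) recovers the text.
as_latin1 = latin1_bytes.decode("iso-8859-1")

# Viewing the same bytes as UTF-8 does not: 0xF8 is not valid UTF-8,
# so the decoder substitutes the replacement character U+FFFD.
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)      # the raw bytes on the wire
print(ascii(as_latin1))  # original text, shown with escapes
print(ascii(as_utf8))    # damaged text, shown with escapes
```

The bytes never change; only the interpretation does, which is why
switching the view encoding cannot "convert" the page.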

In an earlier post on this thread I suggested that sending a file encoded
as UTF-8 to a browser will work reliably for Western European users... and
it will.

Try this search out:

http://search.ie.altavista.com/cgi-bin/query?pg=q&sc=on&cn=ie&cl=en&q=%C3%BCberpr%C3%BCfen&kl=XX&what=pg.ie

This particular AltaVista site will return a UTF-8 page. You can elect to
make it return something else by clicking on Customize; if it returns a
Latin-1 page because you've visited AltaVista before, click on Customize
to change it to UTF-8.
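
The search term in that URL is itself percent-encoded UTF-8. A short Python
sketch, using only the standard library's urllib.parse, shows the bytes
behind the q= parameter; the variable names are just for illustration.

```python
from urllib.parse import unquote, unquote_to_bytes

# The q= value copied from the AltaVista URL above.
query = "%C3%BCberpr%C3%BCfen"

# Each %XX escape is one byte; 0xC3 0xBC is the UTF-8 form of u-umlaut.
raw = unquote_to_bytes(query)

# Decoding those bytes as UTF-8 yields the German search word.
word = unquote(query, encoding="utf-8")

print(raw)
print(ascii(word))
```

Two bytes per umlaut in the query is the same pattern you will see in the
page body once you view it as Latin-1.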

You can clearly see umlauts and other fine German non-ASCII characters on
this page. You can see the underlying encoding by changing the browser's
view encoding to Latin-1. See how the German word "zuruck" (I'd spell
it with the umlaut, but I'm working on the Wyse50 today) acquires two
garbage characters in the middle? Those are the two UTF-8 bytes of the
u-umlaut, each displayed as a separate 8-bit Latin-1 character.
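
The same effect can be reproduced outside the browser. This is only a
sketch in Python, with the word written using an escape so it survives
any terminal:

```python
# "zuruck" with the umlaut: U+00FC is u-umlaut.
word = "zur\u00fcck"

# In UTF-8 the u-umlaut becomes two bytes: 0xC3 0xBC.
utf8_bytes = word.encode("utf-8")

# Reading those bytes as Latin-1 maps each byte to its own character:
# 0xC3 -> A-tilde (U+00C3), 0xBC -> one-quarter sign (U+00BC).
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)
print(ascii(misread))
print(len(word), len(misread))  # the misread word is one character longer
```

One character in, two garbage characters out, which is exactly what the
AltaVista page shows when viewed as Latin-1.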

So:

*IF* the page is already encoded as UTF-8 and tagged as such with a META
tag, you will see it correctly in a default Western European installation
of any browser recent enough to support Unicode.

If the page is encoded as something else, you have to look at it in *THAT*
encoding.

Best Regards,

Addison

===========================================================
Addison P. Phillips Principal Consultant
Inter-Locale LLC http://www.inter-locale.com
Los Gatos, CA, USA mailto:addison@inter-locale.com

+1 408.210.3569 (mobile) +1 408.904.4762 (fax)
===========================================================
Globalization Engineering & Consulting Services

On Wed, 4 Oct 2000, George Zeigler wrote:

> Hello,
> I looked at the Yahoo page given here:
> http://dir.yahoo.com/Regional/Countries/Norway/Cities_and_Towns/
>
> It has some characters used in Norwegian such as the O with a slash
> through it. I changed the character set to Unicode (UTF-8) and it no longer
> showed up properly. I thought Unicode showed ISO pages for European sites
> without any problems. Where have I erred?
>
> George
>


