RE: Eudora (was: Is there Unicode mail out there?)

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Sat Jul 14 2001 - 12:32:42 EDT


Gaute,

>
> > I have no problem sending it our with a " Windows-1252" character
> > set. If you convert to iso-8859-1 you lose characters that is just
> > as bad as sending Windows-1252 out as iso-8859-1.
>
> No. Conversion to ISO-8859-1 is better, since the result is actually
> a valid, meaningful message that reasonable mail clients can interpret
> and display correctly without difficulties. On the other hand, a
> program can not attach a meaningful value to cp1252 charactes in a
> (allegdly) ISO-8859-1 message without making guesses. For the most
> part, replacing stuff like dumbquotes with normal quotes and
> <ellipsis> with "..." is not nearly as bad as having to replace CJK
> ideograms with question marks etc.
The point that I was trying to make is that most of the characters that
don't map between 1252 and 8859-1 are the characters that you need 8859-15
for. The real solution is to map between 1252 and 8859-15 or tag the data
as "windows-1252". You might have a limited audience but at least you are
honest.
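
A rough way to check which of the Windows-1252 additions Latin-9 actually
picks up is to walk the 0x80-0x9F range with Python's standard codecs. This
is only a sketch; the function name is made up for illustration, and it
relies on Python's own codec tables rather than any particular mail client.

```python
def classify_cp1252_extras():
    """Split the cp1252 characters in 0x80-0x9F (the range that ISO-8859-1
    reserves for control codes) by whether ISO-8859-15 can encode them."""
    latin9_ok, neither = [], []
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # position left unassigned in Python's cp1252 table
        try:
            ch.encode("iso-8859-15")
            latin9_ok.append(ch)
        except UnicodeEncodeError:
            neither.append(ch)
    return latin9_ok, neither


latin9_ok, neither = classify_cp1252_extras()
print("in Latin-9:    ", " ".join(latin9_ok))
print("not in Latin-9:", " ".join(neither))
```

The Euro and the extra letters land in the first list; the typographic
quotes, dashes and the ellipsis land in the second, so for those the
"windows-1252" tag (or substitution) is still needed.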

I think that the problem comes from using Windows Notepad to create HTML.
It used to work for the old 1252 code page. Many sites don't use characters
that are different but the Euro introduces a glitch. If you use a Euro then
you have to convert to iso-8859-15 but what tools are you going to use on
Windows? If the browser does not support iso-8859-15 you will not see any
of the page unless the browser defaults to iso-8859-1. If this is the case
you should probably use iso-8859-15 for everything. However, if you use
1252 and don't convert to Latin9 then the good browsers will not see the
page properly.

>
> > The problem is that many browsers do not yet support iso-8859-1
>
> You're kidding, right? Or perhaps you meant -15 ?

You are right, I did mean 15.
>
> > and the systems do not have iso-8859-15 fonts.
>
> That's going to change, and soon. From 2002-1-1 the Euro will replace
> national currencies _completely_ in most EU countries, and an OS that
> can not display the Euro symbol correctly will not be very useful
> anymore. IIRC Euro (i.e. ISO-8859-1) support will be mandated by the
> EU in one form or another (i.e. governmental agencies will not be
> allowed to use software that does not support the Euro sign and so
> on.)
I think you mean ISO-8859-15

>
> But off course, that's orthogonal to the issue at hand anyway. After
> all, if you use a Euro sign in your message then the recipient will
> have problems if he or she does not have a font with the necessary
> glyph installed no matter what you do. It does not matter what
> charset you're using, unless of course you're using a display system
> which is too stupid to remap font encodings on the fly.
>
> A reasonable thing to do in such circumstances is to send ISO-8859-1
> by default, and only use ISO-8859-15 if you're actually using
> characters not found in ISO-8859-1.
It does not make sense. If I am designing web sites I don't want to have to
change the character set for pages that have Euro signs. Besides, the viewer
may have a different set of iso-8859-15 fonts and the look may change from
page to page. I suspect that when 1/1/2002 comes, many European sites
will take the attitude that you had better be ready. People around the
world with browsers that don't support Latin9 will have to change. I
suspect that we will see a lot of patch kits for existing browsers and mail
clients.

Just handling the conversions between Latin1, Latin9 and Windows codepages
to ensure that the content is correct and matches the charset tag is a
nightmare. What happens with revisions? It is a real mess. Your support
team will make mistakes operating with three almost identical code pages.
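
To give an idea of the bookkeeping involved, here is a minimal sketch of
choosing a charset label that the content really fits. The candidate order
and the name pick_charset are assumptions for illustration, not how any
particular mailer or web tool does it.

```python
# Candidate labels, tried from the most widely supported to the least.
CANDIDATES = ("iso-8859-1", "iso-8859-15", "windows-1252", "utf-8")


def pick_charset(text):
    """Return (label, encoded_bytes) for the first candidate charset that
    can represent every character in text without loss."""
    for label in CANDIDATES:
        try:
            return label, text.encode(label)
        except UnicodeEncodeError:
            continue  # this charset is missing a character; try the next
    raise ValueError("no candidate charset can encode the text")


for sample in ("caf\u00e9", "5 \u20ac", "\u20ac and \u00bd"):
    label, data = pick_charset(sample)
    print(label, data)
```

Every revision of a page has to go through a check like this again, since
adding one Euro sign or one curly quote changes which label is honest.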

The problem that I see is that the rest of the world, especially the US, is
ignoring the problem. I think that if they have to change software anyway,
it might be an excellent time to convert to UTF-8.
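
The conversion itself is small once the source charset is known for
certain; the hard part is knowing it. A sketch, with to_utf8 as a
hypothetical helper name:

```python
def to_utf8(raw, declared_charset):
    """Decode raw bytes using the charset they are declared to be in and
    re-encode them as UTF-8. A wrong declaration silently yields wrong
    characters, so the label has to be trusted or verified first."""
    return raw.decode(declared_charset).encode("utf-8")


# The same byte means different things under the near-identical charsets:
print(to_utf8(b"\xa4", "iso-8859-1"))    # currency sign
print(to_utf8(b"\xa4", "iso-8859-15"))   # Euro sign
print(to_utf8(b"\x80", "windows-1252"))  # Euro sign
```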

Carl


