Re: Is there Unicode mail out there?

From: Gaute B Strokkenes (gs234@cam.ac.uk)
Date: Fri Jul 13 2001 - 21:06:26 EDT


On Fri, 13 Jul 2001, Mike_Ayers@bmc.com wrote:
>
>> From: 11@onna.com [mailto:11@onna.com]
>
>> Those are MOJIBAKE for my SIG.
>
> Which is what you deserve for not sending UTF-8. Until you
> upgrade your mailer, your name will be "?@?š‚¶‚イ‚¢‚Á‚¿‚á‚ñ?š". :-p

No way. Any mail client that is sufficiently clever to understand
UTF-8 should understand all valid and registered MIME-charsets. After
all, conversion libraries are both widely available and easy to use.
[I can see you put a smiley after your statement so I realise you were
probably being sarcastic, but I thought that this could bear pointing
out.]
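
To illustrate how little code the conversion step takes, here is a minimal
sketch in Python that uses only the standard codec registry. The helper name
decode_body and the fallback to us-ascii for unknown labels are my own
assumptions for illustration, not what any particular mail client does.

```python
import codecs


def decode_body(raw: bytes, charset: str) -> str:
    """Decode a message body using the charset named in its MIME header.

    Hypothetical helper: unknown charset labels fall back to us-ascii,
    and undecodable bytes become U+FFFD instead of raising.
    """
    try:
        codecs.lookup(charset)  # raises LookupError for unknown labels
    except LookupError:
        charset = "us-ascii"
    return raw.decode(charset, errors="replace")


if __name__ == "__main__":
    # The same function handles legacy charsets and UTF-8 alike.
    print(decode_body("こんにちは".encode("iso-2022-jp"), "iso-2022-jp"))
    print(decode_body("Привет".encode("koi8-r"), "koi8-r"))
    print(decode_body("Привет".encode("utf-8"), "utf-8"))
```

A client that already decodes UTF-8 through a registry like this gets the
legacy charsets with no extra code, which is the point made above.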

All the `all messages should be in UTF-8, even when there are
well-established legacy encodings that cover the characters of a given
message' mumbo-jumbo that has been mentioned recently on the list is
really just so much hot air. Firstly, mail clients will not be able
to deprecate support for other charsets even if UTF-8 is widely
adopted (which it isn't--for email) because of the need to be able to
interpret the masses of existing messages. Secondly, maintaining such
support is, as pointed out above, extremely easy to do. Thirdly,
there are a great number of clients out there that do not support
UTF-8 and are unlikely to do so in the immediate future, either
because of internal limitations in the software that are hard to
remove or because people don't upgrade. I think it's antisocial to
say `Well, I _could_ have used a charset that would have enabled you
to read my message but I decided not to, for no particularly good
reason.'

On the other hand it makes sense to say `Sorry, but UTF-8 is the only
charset that will do since I wanted to use Etruscan, Russian and
Japanese characters and UTF-8 is the only sane way to do this.'
That's the only benefit that Unicode and UTF-8 will bring to email:
the ability to mix and match characters from all scripts of all sizes
and shapes in a single message. OTOH, for those of us who need this
it's a big advantage.
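
The policy argued for here (use a well-established legacy charset when it
covers the message, and UTF-8 only when nothing else will do) can be sketched
roughly as follows. The candidate list and its order are assumptions chosen
for illustration; a real client would pick them according to its users.

```python
# Hypothetical preference order: most widely readable charsets first,
# UTF-8 as the last resort.
CANDIDATES = ["us-ascii", "iso-8859-1", "koi8-r", "iso-2022-jp", "utf-8"]


def pick_charset(text: str, candidates=CANDIDATES) -> str:
    """Return the first candidate charset that can represent all of text."""
    for name in candidates:
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    # Nothing in the list fits; UTF-8 can encode any valid text.
    return "utf-8"


if __name__ == "__main__":
    print(pick_charset("Hello"))
    print(pick_charset("Привет"))
    print(pick_charset("こんにちは"))
    # Etruscan (Old Italic) plus Russian plus Japanese: only UTF-8 fits.
    print(pick_charset("\U00010300 Привет こんにちは"))
```

Only the last message, the one mixing scripts that no single legacy charset
covers, ends up as UTF-8.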

Another thing that some people may worry about is the bad interaction
between quoted-unprintable and UTF-8 (or any non-West European / North
American coding in general, but for UTF-8 it's even worse): 6 bytes
for a single Cyrillic character? Ye gods. [I could start another
rant about how bad an idea QP was in the first place, but that's
off-topic here.]
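
For anyone who wants to check the arithmetic, here is a small sketch using
Python's standard quopri module. The helper name qp_size is made up for this
example. It compares one Cyrillic letter encoded as UTF-8 and as KOI8-R
before and after quoted-printable encoding.

```python
import quopri


def qp_size(text: str, charset: str) -> tuple:
    """Return (raw byte count, quoted-printable byte count) for text."""
    raw = text.encode(charset)
    return len(raw), len(quopri.encodestring(raw))


if __name__ == "__main__":
    # Every non-ASCII byte becomes "=XX", i.e. three bytes on the wire.
    print(qp_size("Ж", "utf-8"))   # two raw bytes, each expanded to three
    print(qp_size("Ж", "koi8-r"))  # one raw byte, expanded to three
```

In UTF-8 the letter is two bytes and QP triples each of them, hence the six
bytes; in KOI8-R it is one byte and costs three.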

-- 
Gaute Strokkenes                        http://www.srcf.ucam.org/~gs234/
I am NOT a nut....


