Re: Is there Unicode mail out there?

From: Jungshik Shin (jshin@mailaps.org)
Date: Wed Jul 11 2001 - 17:38:33 EDT


On Wed, 11 Jul 2001 Peter_Constable@sil.org wrote:

> >> Unicode (UTF-8)". Just as a test, here's a bit of Thai: ¡Ƒ»’΂ڨ↩ō
> > Your mail has the following header, which indicates that
> >it's in 'Windows-874' encoding. I'm not sure whether that encoding name
> >is registered with IANA for use in MIME.
> >
> >> X-Mailer: Lotus Notes Release 5.0.5 September 22, 2000
> >> MIME-Version: 1.0
> >> Content-type: text/plain; charset=Windows-874
>
> OK, I didn't look closely at the header, just at the result. Here's another
> test that will be telling - I don't know of any codepage / charset for
> Ethiopic: ሀሁሂሃ

   Yes, this time you made it :-)

> X-Mailer: Lotus Notes Release 5.0.5 September 22, 2000
> MIME-Version: 1.0
> Content-type: text/plain; charset=UTF-8

  Now the question is whether it's possible to force Lotus Notes
to use UTF-8 as the encoding of outgoing messages EVEN WHEN
all the characters in a message are covered by an existing
encoding other than UTF-8 (e.g. Windows-874 for Thai).

 One exception to this should be US-ASCII, because not only is the
repertoire of US-ASCII a subset of the repertoire of UTF-8, but the
representation of every US-ASCII character is also identical in UTF-8.
A smart mail client would notice that all characters in a message
are in the US-ASCII repertoire and label it as US-ASCII EVEN if it's
configured to label outgoing messages as UTF-8 (or as any other
superset of US-ASCII, such as EUC-KR, ISO-2022-JP, GB2312-80, or
ISO-8859-[1-9,15]; a better term than GB2312-80 is certainly EUC-CN,
but it's not registered with IANA and GB2312-80 has spread too widely
to be remedied). There's no violation of standards in NOT doing this,
but doing it would surely reduce the chance of an unnecessary 'red
flag' being raised by some mail clients on the recipient's side.
Unfortunately, MS OE and Netscape-Mail are not smart in this regard,
while Pine and Mutt are.
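
  The labeling rule described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not how Pine, Mutt, or
any other client actually implements it; the function names
`choose_charset_label` and `build_message` are hypothetical, and the
message is built with the standard `email` package.

```python
from email.message import EmailMessage

def choose_charset_label(text, configured="utf-8"):
    """Pick the charset label for an outgoing message body.

    If every character is in the US-ASCII repertoire, label the message
    as us-ascii regardless of the configured charset; otherwise fall
    back to the configured one.
    """
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return configured
    return "us-ascii"

def build_message(body, configured="utf-8"):
    """Build a text/plain message labeled according to the rule above."""
    msg = EmailMessage()
    msg.set_content(body, charset=choose_charset_label(body, configured))
    return msg

# Pure ASCII text has the same bytes in US-ASCII and in UTF-8, which is
# why downgrading the label is safe.
assert "Hello".encode("ascii") == "Hello".encode("utf-8")

print(build_message("Hello").get_content_charset())   # us-ascii
print(build_message("ሀሁሂሃ").get_content_charset())    # utf-8
```

  The same check applies when the configured charset is some other
superset of US-ASCII; only the fallback label changes.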

  Jungshik Shin

P.S. How about making a sort of resolution recommending that anybody
writing to this list use UTF-8 *if/when* possible?
This was suggested in the past, but we're still getting
a lot of messages in ISO-8859-1 and other encodings.


