On Thu, 12 Jul 2001 DougEwell2@cs.com wrote:
> In a message dated 2001-07-11 15:03:27 Pacific Daylight Time,
> jshin@mailaps.org writes:
>
> > One exception to this should be US-ASCII, because not only is the
> > repertoire of US-ASCII a subset of the repertoire of UTF-8, but the
> > representation of every US-ASCII character is also identical in UTF-8.
> > A smart mail client would notice that all characters
> > are in the US-ASCII repertoire and label outgoing messages as
> > US-ASCII EVEN if it's configured to label outgoing messages
> > as UTF-8.
> I thought this might even be enshrined in an RFC. It certainly makes sense.
> If you are using a mailer that sends CP1252 down the wire (not that this is a
> good idea, but some mailers do this), the mailer should examine the message
> and if it only contains US-ASCII characters, the message should be tagged as
> US-ASCII. Otherwise, if it only contains ISO 8859-1, it should be tagged as
> ISO 8859-1. Only if it actually contains CP1252 characters, like smart
> quotes or long dashes, should it be tagged as CP1252. As Jungshik observed,
> the same goes for UTF-8.
I can't say it better than you did! While focusing on
UTF-8, I forgot to mention the case involving Windows-125x, ISO-8859-x,
and US-ASCII.
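
The labelling rule described above can be sketched in a few lines of
Python. This is only a minimal illustration, not code from any real mail
client: the function name narrowest_label and the CHAINS table are
hypothetical, and the sketch assumes the mailer already holds the message
body as a Unicode string. It tries each candidate charset from narrowest
to widest and returns the first one that can represent the whole text.
For a UTF-8 mailer only US-ASCII is tried first, because only US-ASCII
has the same byte representation as UTF-8.

```python
# Hypothetical table: for each configured charset, the labels to try,
# ordered from narrowest to widest. Each earlier entry is byte-compatible
# with the later ones for the characters it can encode.
CHAINS = {
    "utf-8": ["us-ascii", "utf-8"],
    "windows-1252": ["us-ascii", "iso-8859-1", "windows-1252"],
    "iso-8859-1": ["us-ascii", "iso-8859-1"],
}


def narrowest_label(text, configured):
    """Return the narrowest charset label that can encode `text`.

    Returns None if not even the configured charset can encode it,
    which is the case where the user should be warned.
    """
    for charset in CHAINS[configured]:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset is too narrow, try the next one
        return charset
    return None


if __name__ == "__main__":
    print(narrowest_label("plain text", "utf-8"))
    print(narrowest_label("caf\u00e9", "windows-1252"))
    print(narrowest_label("\u201csmart quotes\u201d", "windows-1252"))
```

The bytes put on the wire would then be text.encode(label), so the
label and the content always agree.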
BTW, some broken or MIME-ignorant mail clients (e.g. Eudora for MS-Windows)
do roughly the opposite: they label outgoing messages as ISO 8859-1
even when the messages include characters such as smart quotes and long
dashes, which are not in the ISO 8859-1 repertoire. The best behavior
would be to warn users that their messages contain characters outside
their preferred encoding and to offer a couple of options to choose from
(use Unicode or another wider encoding, or 'transliterate' those
characters into ones in the repertoire of the user's preferred
encoding). Short of that, the client should at least label the message
correctly (not that I'm in favor of sending Windows-1252 down the wire).
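
As a rough sketch of the "warn and offer options" behavior, the Python
below finds the characters that fall outside a preferred encoding and
optionally replaces them with look-alikes. Everything here is
illustrative: the function names and the small FALLBACKS table are
hypothetical, and a real client would need a much fuller transliteration
table and a proper dialog for the user.

```python
# Hypothetical, deliberately tiny transliteration table.
FALLBACKS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
}


def can_encode(ch, charset):
    """True if the single character `ch` is representable in `charset`."""
    try:
        ch.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


def unencodable(text, charset):
    """List the distinct characters of `text` not in `charset`, in order."""
    bad = []
    for ch in text:
        if not can_encode(ch, charset) and ch not in bad:
            bad.append(ch)
    return bad


def transliterate(text, charset):
    """Replace characters outside `charset` using FALLBACKS, else '?'."""
    return "".join(
        ch if can_encode(ch, charset) else FALLBACKS.get(ch, "?")
        for ch in text
    )


if __name__ == "__main__":
    body = "\u201chi\u201d \u2014 ok"
    print(unencodable(body, "iso-8859-1"))
    print(transliterate(body, "iso-8859-1"))
```

If unencodable() returns a non-empty list, the client can show those
characters to the user and let them choose between switching to a wider
encoding and sending the transliterated text.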
Jungshik Shin