Re: Is there Unicode mail out there?

From: Tex Texin (texin@progress.com)
Date: Thu Jul 12 2001 - 11:51:48 EDT


(I didn't read the whole thread, so maybe I missed a step.)

So the proposal is that minimizing the charset is a good thing?

Suppose you and I start out in a conversation about a
product I am trying to sell you. It happens to be all in ASCII,
and we exchange several mails successfully. Then I quote you
a price in Euros, and my 1252 message gets corrupted by your
reader, which can handle only 8859-1 or ASCII. You miss
the fact that the Euro is corrupted and think we
are talking dollars or some other currency.
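
To make the scenario concrete, here is a minimal Python sketch of what
can happen to the Euro sign on the wire. The sample text and variable
names are illustrative assumptions, and real mail readers may behave
differently (many will show a box or a question mark instead).

```python
# Illustrative only: a CP1252 message read by a client that assumes
# ISO 8859-1 or plain ASCII.
price = "Price: \u20ac100"            # U+20AC is the Euro sign

# In CP1252 the Euro sign is the single byte 0x80.
wire = price.encode("cp1252")

# ISO 8859-1 maps 0x80 to U+0080, an invisible C1 control character,
# so the Euro sign silently disappears from what the reader sees.
as_latin1 = wire.decode("iso-8859-1")

# A lenient ASCII reader that skips bytes it cannot handle drops it too.
as_ascii = wire.decode("ascii", errors="ignore")

print(repr(wire))
print(repr(as_latin1))
print(repr(as_ascii))
```

In both cases the digits survive and the currency symbol does not,
which is exactly the kind of quiet failure described above.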

Although I understand why you would want a minimal charset, so as
not to needlessly prevent communication, the sense of
reliability and trust that is built by having some success is
a problem. You think you are communicating successfully, but when it
is critical you may not be...

Perhaps if a harder line were taken when characters
are used that cannot be converted, this would make more sense
(i.e., give a very clear, recognizable indication of corruption or
conversion failures).
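
One way such a "harder line" might look, as a rough Python sketch: any
character that the target charset cannot represent is replaced by a
loud, unmistakable marker and also reported to the caller. The function
name convert_loudly and the marker format are hypothetical, not taken
from any mailer.

```python
def convert_loudly(text, charset):
    """Return (converted_text, failures) for the given target charset.

    Characters that cannot be encoded are replaced by a visible marker
    rather than being dropped or turned into a quiet '?'.
    """
    out = []
    failures = []
    for ch in text:
        try:
            ch.encode(charset)          # strict by default
            out.append(ch)
        except UnicodeEncodeError:
            out.append(f"[UNCONVERTIBLE U+{ord(ch):04X}]")
            failures.append(ch)
    return "".join(out), failures


shown, lost = convert_loudly("Price: \u20ac100", "iso-8859-1")
print(shown)
if lost:
    print("WARNING: this message contained characters that could not be shown")
```

With the Euro example, the reader would see the marker in place of the
currency symbol instead of a bare "100".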

tex

DougEwell2@cs.com wrote:
>
> In a message dated 2001-07-11 15:03:27 Pacific Daylight Time,
> jshin@mailaps.org writes:
>
> > One exception to this should be US-ASCII because not only the repertoire
> > of US-ASCII is a subset of the repertoire of UTF-8 but also the
> > representation of all characters in US-ASCII is identical in UTF-8.
> > A smart mail client would notice that all characters
> > are in US-ASCII repertoire and label outgoing messages as in
> > US-ASCII EVEN if it's configured to label outgoing messages
> > in UTF-8
> [...]
>
> I thought this might even be enshrined in an RFC. It certainly makes sense.
> If you are using a mailer that sends CP1252 down the wire (not that this is a
> good idea, but some mailers do this), the mailer should examine the message
> and if it only contains US-ASCII characters, the message should be tagged as
> US-ASCII. Otherwise, if it only contains ISO 8859-1, it should be tagged as
> ISO 8859-1. Only if it actually contains CP1252 characters, like smart
> quotes or long dashes, should it be tagged as CP1252. As Jungshik observed,
> the same goes for UTF-8.
>
> -Doug Ewell
> Fullerton, California
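
For reference, the tagging rule quoted above (pick the smallest charset
that can represent the whole message) can be sketched in Python roughly
as follows. The candidate list and the function name minimal_charset
are assumptions for illustration, not the behavior of any particular
mailer.

```python
# Candidates ordered from smallest repertoire to largest.
CANDIDATES = ("us-ascii", "iso-8859-1", "windows-1252", "utf-8")


def minimal_charset(text):
    """Return the first candidate charset that can encode all of text."""
    for charset in CANDIDATES:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    raise ValueError("text cannot be encoded in any candidate charset")


print(minimal_charset("Price: $100"))        # ASCII only
print(minimal_charset("caf\u00e9"))          # needs ISO 8859-1
print(minimal_charset("Price: \u20ac100"))   # Euro sign needs CP1252
```

Note that the label changes from one message to the next as soon as
the Euro sign appears, which is the situation the reply above worries
about.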

-- 
---------------------------------------------------------------
Tex Texin                      Director, International Business
mailto:Texin@Progress.com      +1-781-280-4271
Fax:+1-781-280-4655
the Progress Company           14 Oak Park, Bedford, MA 01730
---------------------------------------------------------------


