Re: UTF-8 was Re: Smiles, faces, etc

From: Doug Ewell (dewell@adelphia.net)
Date: Sun Feb 17 2002 - 19:23:20 EST


Curtis Clark <jcclark@mockfont.com> wrote:

> At 08:30 PM 2/14/02, David Starner wrote:
> >One out of two ain't bad, I guess. That was garbage on the screens of
> >some of the subscribers, though - UTF-8 display is still not
> >universal.

You have a UTF-8 sig block, right, David? :-)

With my recent change in service providers, I am now using the feared
and loathed Outlook Express and can read UTF-8 e-mails just fine, but
previously it would have been garbage on my screen as well.

> That's why I always open SC Unipad when I read this list, and paste as
> UTF-8. Unfortunately, Unipad seems to choke when one of the bytes of a
> UTF-8 sequence is 20h.

That makes sense. 20h is never valid inside a UTF-8 multibyte sequence:
every byte of such a sequence has its high bit set (80h or above), and
20h does not.
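
To make the byte ranges concrete, here is a minimal Python sketch. The
function names are mine, purely for illustration, and the lead-byte
ranges follow the current (RFC 3629) definition of UTF-8, which is a
bit narrower than some older definitions. Either way, 20h never
qualifies as a trail byte.

```python
def expected_length(lead: int) -> int:
    """Length of the sequence a lead byte announces, or 0 if the
    byte cannot start a sequence at all."""
    if lead < 0x80:
        return 1              # ASCII, including 0x20 (space)
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0                  # 0x80-0xC1 and 0xF5-0xFF are not lead bytes


def is_trail_byte(b: int) -> bool:
    """Trail (continuation) bytes are always in the range 0x80-0xBF."""
    return 0x80 <= b <= 0xBF
```

So E2h announces a three-byte sequence, and a 20h arriving where one of
its two trail bytes should be is simply a stray space that breaks the
sequence; a strict decoder such as Python's bytes.decode("utf-8") will
raise UnicodeDecodeError on it.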

The notes after conformance clause C12 that were added in Unicode 3.1
allow a process to try to patch up UTF-8 text that has been "mangled"
(technical term) by inserting CRLF, or I suppose 20h, in the middle of a
sequence. UniPad doesn't try to repair such data.
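
For what it's worth, such a patch-up could look roughly like the Python
sketch below. This is only my illustration of the idea, not anything
the standard spells out and not a description of any particular
product; the set of "stray" insertions it undoes (CRLF, LF, 20h) and
the function names are assumptions on my part.

```python
# Hypothetical set of insertions to undo when found inside a sequence.
STRAY = (b"\r\n", b"\n", b" ")


def seq_length(lead: int) -> int:
    """Sequence length announced by a lead byte; 1 for anything else."""
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 1


def is_trail(b: int) -> bool:
    return 0x80 <= b <= 0xBF


def repair(data: bytes) -> bytes:
    """Drop CRLF, LF, or space that sits between a lead byte and the
    trail bytes it is still waiting for. Everything else is copied."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        need = seq_length(data[i]) - 1   # trail bytes still expected
        seq = bytearray([data[i]])
        j = i + 1
        while need > 0 and j < n:
            if is_trail(data[j]):
                seq.append(data[j])
                j += 1
                need -= 1
                continue
            # Not a trail byte: skip a stray insertion only if a trail
            # byte follows it immediately.
            for stray in STRAY:
                k = j + len(stray)
                if data.startswith(stray, j) and k < n and is_trail(data[k]):
                    j = k
                    break
            else:
                break                    # broken some other way; give up
        if need == 0:
            out += seq                   # complete (possibly repaired)
            i = j
        else:
            out.append(data[i])          # leave the damage untouched
            i += 1
    return bytes(out)
```

For example, repair(b"\xe2 \x82\xac") gives back b"\xe2\x82\xac", the
euro sign, while a space that follows a complete character is left
alone. A space is only ever dropped where the text was already invalid
UTF-8, so well-formed input passes through unchanged.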

-Doug Ewell
 Fullerton, California


