Re: [unicode] More ways to encode U+FEFF (was: Re: Designing a

From: David Starner (dvdeug@x8b4e53cd.dhcp.okstate.edu)
Date: Wed Sep 06 2000 - 16:22:32 EDT


On Wed, Sep 06, 2000 at 08:13:41AM -0800, Markus Scherer wrote:
> of this list, only UTF-EBCDIC is a viable encoding form.
> the others are either deprecated, never made it beyond draft, or are unofficial discussion pieces that never made it anywhere (i proposed one of them :-).
>
> if you detect all the big- and little-endian boms for the standard forms
> utf-8, utf-16, utf-32, scsu, utf-ebcdic
> then you will be a hero. any of them may come with a bom depending on protocol and os.
>
> markus
>
> David Starner wrote:
> > > UTF-1: F7 64 4C
> > > UTF-7: 2B 2F 76 38 2D "+/v8-"
> > > UTF-7d5: BF FB FF
> > > UTF-8C1: BB ED DF
> > > UTF-9: 93 FD FF
> > > UTF-EBCDIC: DD 73 66 73
> > > UTF-mu(2): 9F 9B FF
> > > UCN(3): 5C 75 66 65 66 66 "\ufeff"
> > > DUCK(4): 81 FE FF

I realize some of these were more discussion pieces; honestly, I was
planning on implementing SCSU, UTF-1, UTF-7 and 8/16/32 BE/LE. Why
UTF-EBCDIC? I would think that UTF-7 is more common in use, as once in
a while you'll run across it in mail and newsgroups. I feel a need to
at least support UTF-7, in case someone wants to write a mail reader
with Ngeadal.
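
For illustration, here is a rough sketch of the signature check being
discussed, in Python. The UTF-1, UTF-7 and UTF-EBCDIC byte values are
the ones quoted above; the UTF-8, UTF-16, UTF-32 and SCSU values are the
usual encodings of U+FEFF in those forms. The table and function names
(SIGNATURES, sniff_signature) are made up for this example, and a real
detector would probably need more care than this.

```python
# Encoded forms of U+FEFF, checked longest first so that UTF-32LE
# (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
SIGNATURES = [
    (b"\x00\x00\xfe\xff", "UTF-32BE"),
    (b"\xff\xfe\x00\x00", "UTF-32LE"),
    (b"\xdd\x73\x66\x73", "UTF-EBCDIC"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\x0e\xfe\xff", "SCSU"),
    (b"\xf7\x64\x4c", "UTF-1"),
    # Only "+/v" is matched for UTF-7: the fourth byte depends on the
    # character that follows, so "+/v8-" is just one possible form.
    (b"\x2b\x2f\x76", "UTF-7"),
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
]


def sniff_signature(data):
    """Return the name of the encoding whose signature starts data, or None."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


if __name__ == "__main__":
    for sample in (b"\xef\xbb\xbfhello", b"\xff\xfeh\x00", b"+/v8-hello", b"hello"):
        print(sample, sniff_signature(sample))
```

A match is only a hint: UTF-16LE text whose first character after the
BOM is U+0000 looks the same as UTF-32LE, and plain ASCII text may
happen to begin with "+/v".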

-- 
David Starner - dstarner98@aasaa.ofe.org
http/ftp: dvdeug.dhis.org
I knew all of the floors in my high school, and none of the ceilings.
	- Chris Painter



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:13 EDT