Re: [unicode] More ways to encode U+FEFF (was: Re: Designing a multilingual

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Wed Sep 06 2000 - 12:20:32 EDT


of the list quoted below, only UTF-EBCDIC is a viable encoding form.
the others are either deprecated, never made it beyond draft, or are unofficial discussion pieces that never made it anywhere (i proposed one of them :-).

if you detect all the big- and little-endian boms for the standard forms
    utf-8, utf-16, utf-32, scsu, utf-ebcdic
then you will be a hero. any of them may come with a bom depending on protocol and os.
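
A minimal sketch of that detection in Python, for illustration only. The table name `BOM_SIGNATURES` and the function `detect_bom` are hypothetical names, not part of any library. The byte sequences are the usual encodings of U+FEFF in each form; the UTF-EBCDIC one is the same as in the quoted list. It assumes the caller has the first few bytes of the stream available.

```python
# Encoded forms of U+FEFF for the standard forms named above.
# Longer signatures come first so that UTF-32LE (FF FE 00 00) is
# tested before UTF-16LE (FF FE), which is a prefix of it.
BOM_SIGNATURES = [
    (b"\x00\x00\xfe\xff", "UTF-32BE"),
    (b"\xff\xfe\x00\x00", "UTF-32LE"),
    (b"\xdd\x73\x66\x73", "UTF-EBCDIC"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\x0e\xfe\xff", "SCSU"),
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
]


def detect_bom(data: bytes):
    """Return (form_name, bom_length), or (None, 0) if no signature matches."""
    for signature, name in BOM_SIGNATURES:
        if data.startswith(signature):
            return name, len(signature)
    return None, 0


if __name__ == "__main__":
    sample = "\ufeffhello".encode("utf-16-le")
    print(detect_bom(sample))
```

One caveat with this approach: FF FE 00 00 is also a valid start of UTF-16LE text (a BOM followed by U+0000), so a match on a signature is a strong hint rather than proof, and the protocol or OS context still decides.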

markus

David Starner wrote:
> > UTF-1: F7 64 4C
> > UTF-7: 2B 2F 76 38 2D "+/v8-"
> > UTF-7d5: BF FB FF
> > UTF-8C1: BB ED DF
> > UTF-9: 93 FD FF
> > UTF-EBCDIC: DD 73 66 73
> > UTF-mu(2): 9F 9B FF
> > UCN(3): 5C 75 66 65 66 66 "\ufeff"
> > DUCK(4): 81 FE FF


