BOMbs (was Re: Private Use Surrogate Pairs)

From: i18nGuy Tex Texin (tex@i18nguy.com)
Date: Thu May 09 2002 - 02:23:56 EDT


Doug,
Although excluding the U+xxFFFE code points can help prevent that
confusion, for it to be a *good reason*, it first has to be shown (or
believed) that there is a need not only for an indicator of endian-ness,
but also for a (weak) encoding indicator.

Second, it has to be shown (or believed) that the indicator should be
this particular value 00 00 FE FF and not another one that doesn't offer
this potential confusion to begin with.

I can buy endian-ness. I am not sold on (weak) encoding signatures.

hth
tex

Doug Ewell wrote:
>
> Peter_Constable at sil dot org wrote:
>
> > I think Jim is asking for clarification in the text of the Standard
> > and not just in a response to him, but in case anyone isn't sure,
> > the four that are excluded are U+FFFFE, U+FFFFF, U+10FFFE and
> > U+10FFFF.
> >
> > And don't bother asking for a good reason *why* they are excluded:
> > there isn't any good reason why; they just are.
>
> I know it's popular to say there's no good reason for these to be
> excluded, but at least excluding the U+xxFFFE code points helps prevent
> UTF-32LE from being detected as big-endian UTF-16 with BOM:
>
> Big-endian UTF-16: FE FF .. ..
> U+xxFFFE in UTF-32LE: FE FF xx 00
>
> -Doug Ewell
> Fullerton, California
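
For anyone who wants to see the byte-level overlap Doug describes, here is
a minimal sketch in Python. The helper names (utf32le_bytes, naive_sniff)
are hypothetical and made up for illustration; the sniffer is deliberately
simplistic and only checks for the big-endian UTF-16 BOM, which is the
assumption the confusion depends on.

```python
import codecs
import struct


def utf32le_bytes(code_point: int) -> bytes:
    """Serialize one code point as a 4-byte little-endian UTF-32 code unit."""
    return struct.pack("<I", code_point)


def naive_sniff(data: bytes) -> str:
    """Hypothetical detector: trusts a leading FE FF as a UTF-16BE BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):  # b'\xfe\xff'
        return "utf-16-be"
    return "unknown"


# U+xxFFFE noncharacters, including the two in the private use planes.
for cp in (0xFFFE, 0xFFFFE, 0x10FFFE):
    raw = utf32le_bytes(cp)
    print(f"U+{cp:04X} in UTF-32LE: {raw.hex(' ').upper()} -> {naive_sniff(raw)}")

# An ordinary character does not trigger the false match.
print(naive_sniff(utf32le_bytes(ord("A"))))
```

Each U+xxFFFE value serializes as FE FF xx 00, so a stream that began with
one of them would look, to this kind of detector, like big-endian UTF-16
with a BOM. Whether that is a good enough reason for the exclusion is a
separate question.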

-- 
-------------------------------------------------------------
Tex Texin
mailto:Tex@i18nGuy.com
http://www.i18nGuy.com
-------------------------------------------------------------
What's wrong with locales?
http://www.i18nguy.com/locales/index.html

This archive was generated by hypermail 2.1.2 : Thu May 09 2002 - 03:25:56 EDT