Re: UTF-8N?

From: Peter_Constable@sil.org
Date: Fri Jun 23 2000 - 09:25:46 EDT


Ken:

>Yes. The Unicode Standard will deprecate the use of U+FEFF (Note: not U+FFFE)
>as a zero-width non-breaking space (despite its formal name).
>
>And U+FEFF should *only* be used as a byte order mark and/or signature. (That
>is already ambiguous and trouble enough -- without tossing in the orthogonal
>issue of the need for a non-breaking zero-width space.)

Does this mean that the text of D33 - D35 will change and that a new
normative statement will be added to the following effect?

Dnn: An initial byte sequence corresponding to U+FEFF is always interpreted
as a _byte order mark_: it is used to distinguish between two byte orders,
or as a signature to aid in identifying the encoding scheme. The _byte
order mark_ is not considered part of the content of the text. A
serialization of Unicode values into any encoding scheme may or may not
begin with a _byte order mark_.

(This is mostly a generalization of wording currently part of D35, with
references to UTF-16 made generic.)
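
To make the intent concrete, here is a rough sketch in Python of how a reader
might apply such a rule. It is only an illustration under stated assumptions,
not proposed normative behaviour: the names split_bom and decode_text are
hypothetical, the fallback to UTF-8 when no mark is present is my assumption,
and the inclusion of the UTF-32 signatures is likewise an assumption about
which encoding schemes "any encoding scheme" would cover.

```python
import codecs

# Candidate signatures, longest first: the UTF-32LE mark (FF FE 00 00)
# starts with the UTF-16LE mark (FF FE), so order matters. A stream that
# is really UTF-16LE and begins with U+FEFF U+0000 would be misread here;
# that residual ambiguity is inherent in signature sniffing.
_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def split_bom(data: bytes):
    """Return (encoding, payload); encoding is None if no mark is present."""
    for signature, encoding in _SIGNATURES:
        if data.startswith(signature):
            # The mark is not part of the content, so drop it.
            return encoding, data[len(signature):]
    return None, data


def decode_text(data: bytes, default: str = "utf-8") -> str:
    """Decode data, using an initial byte order mark if there is one."""
    encoding, payload = split_bom(data)
    # Assumption: without a mark, fall back to a caller-supplied default.
    return payload.decode(encoding or default)
```

Under this reading, decode_text gives the same text whether or not the
serialization begins with the mark, and a U+FEFF anywhere other than the
start of the data is left untouched.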

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


