Re: Last Call: UTF-16, an encoding of ISO 10646 to Informational

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Fri Aug 13 1999 - 14:43:03 EDT


> The IESG has received a request to consider UTF-16, an encoding of ISO
> 10646 <draft-hoffman-utf16-04.txt> as an Informational RFC. This has
> been reviewed in the IETF but is not the product of an IETF Working
> Group.
>
> The IESG plans to make a decision in the next few weeks, and solicits
> final comments on this action. Please send any comments to the
> iesg@ietf.org or ietf@ietf.org mailing lists by September 13, 1999.
>
A brief comment...

Internet standards have to do with what goes on the wire. Where character
sets are concerned, Internet standards should recognize only international
standard character sets, namely those registered in the ISO International
Register of Coded Character Sets, as is UTF-16. So far so good.

But there is no mention of byte order in the ISO registrations for UTF-16;
instead it is registered according to Level (1, 2, or 3). Internal
representation on machines of different architectures is, and should be,
irrelevant to character-set standardization.
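
To make the byte-order issue concrete, here is a minimal Python sketch. The
sample string is hypothetical and chosen only for illustration. It shows the
same two characters serialized in each byte order, and what a receiver sees
if it assumes the wrong one.

```python
# Hypothetical sample text: LATIN CAPITAL LETTER A and
# LATIN SMALL LETTER E WITH ACUTE.
text = "A\u00e9"

# The same 16-bit code units, serialized in the two possible byte orders.
big_endian = text.encode("utf-16-be")
little_endian = text.encode("utf-16-le")

print(big_endian.hex())     # 004100e9
print(little_endian.hex())  # 4100e900

# A receiver that assumes the wrong order does not get an error here;
# it silently gets different characters.
misread = big_endian.decode("utf-16-le")
print([hex(ord(ch)) for ch in misread])  # ['0x4100', '0xe900']
```

Nothing in the bytes themselves says which reading is correct, so the order
has to be fixed by the protocol or somehow signalled.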

I would suggest, therefore, that the IETF follow the categorizations in
the ISO Register, where we have a ready-made catalog of coded character
sets, along with a unique and unambiguous way in which to refer to them:
the ISO registration number.

(Of course I have made the same suggestion many times in the past, and yet
Internet standards are chock full of bizarre nonstandard and proprietary
character sets and encodings that have no business in standard
vendor-neutral protocols. This gives rise to applications that feel it is
perfectly OK to send e-mail in, say, Code Page 1251 and expect the
recipient to be able to read it, as long as the "charset" is announced.
But I digress.)

The IETF is in a position to legislate what flies around on the Internet
wires, and should exercise its power in this case to mandate UTF-16 in one
and only one form rather than all possible forms including "guess". Network
protocols work as intended when the agents at each end of a connection
convert between their own local format and the well-defined standard one on
the wire. Let's take this opportunity to avoid yet another imponderable.
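
As a rough sketch of the "convert at each end" model, assuming purely for
illustration that the single mandated form were big-endian with no byte
order mark (this message does not say which form should be chosen), the
two agents might look like this in Python. The names WIRE_CODEC, to_wire,
and from_wire are hypothetical.

```python
# Assumption for illustration only: the one wire form is big-endian UTF-16
# without a byte order mark. The choice of form is not made in this message.
WIRE_CODEC = "utf-16-be"


def to_wire(text: str) -> bytes:
    """Sender side: convert local text to the fixed wire form."""
    return text.encode(WIRE_CODEC)


def from_wire(data: bytes) -> str:
    """Receiver side: convert the fixed wire form back to local text."""
    return data.decode(WIRE_CODEC)


# The bytes on the wire are the same no matter what architecture the
# sender runs on, so the receiver never has to guess.
payload = to_wire("A")
print(payload.hex())       # 0041
print(from_wire(payload))  # A

# Characters outside the BMP travel as a surrogate pair, high half first.
print(to_wire("\U00010000").hex())  # d800dc00
```

Each end deals only with its own local representation and the one wire
form; neither needs to know anything about the other machine.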

- Frank


