RE: UTF-17

From: Marco Cimarosti
Date: Fri Jun 22 2001 - 05:10:40 EDT

Kenneth Whistler wrote:
> In the way of solutions seeking a problem, I would like to
> propose a new UTF: UTF-17.

As [Cicero] would have said:

        Times are bad.
        Developers no longer follow specs,
        and everyone is proposing a new UTF.

> UTF-17 will interoperate easily with UTF-64.

It depends on what you mean by UTF-64 (tm). If you mean either the original
design by [Keinanen] or the hacked design by [Cimarosti, Ewell], then
interoperability isn't any easier than with other UTFs. You have to decode
UTF-17 into code points, and then re-encode three (or more) code points
packed into a 64-bit word.
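
For illustration, the re-encoding step might look roughly like the Python
sketch below. The bit layout (first code point in the highest of the three
21-bit fields, top bit of the word left unused) and the names pack3 and
unpack3 are assumptions made up for this example; the actual layout of
either proposal is not given here.

```python
MAX_CODE_POINT = 0x10FFFF      # highest Unicode code point, fits in 21 bits
MASK_21 = (1 << 21) - 1        # mask selecting one 21-bit field


def pack3(code_points):
    """Pack exactly three code points into one 64-bit integer.

    Hypothetical layout: the first code point occupies bits 42-62,
    the second bits 21-41, the third bits 0-20; bit 63 stays zero.
    """
    if len(code_points) != 3:
        raise ValueError("expected exactly three code points")
    word = 0
    for cp in code_points:
        if not 0 <= cp <= MAX_CODE_POINT:
            raise ValueError(f"not a valid code point: {cp!r}")
        word = (word << 21) | cp
    return word


def unpack3(word):
    """Recover the three code points from a word built by pack3."""
    return [(word >> shift) & MASK_21 for shift in (42, 21, 0)]
```

For example, unpack3(pack3([0x41, 0x10FFFF, 0x20AC])) gives back the same
three code points, and the packed value always fits in 63 bits.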

Has anyone already proposed a *UTF-17S*, where astral characters are
encoded with a 16-byte sequence? That would be very handy for people who
need an obfuscated sorting order.


- Cicero, Marcus Tullius: motto for the signature of Marcus Leisher's
e-mails ("Times are bad. Children no longer obey their parents, and
everyone is writing a book").

- Keinanen, Paul: authentic UTF-64 proposal (three 21-bit chars in a 64-bit
word).

- Cimarosti, Marco (proposer) and Ewell, Doug (implementer): apocryphal
UTF-64 proposal (three 21-bit chars or nine 7-bit chars in a 64-bit word).

_ Marcus :-)
