Re: UTF-17

From: DougEwell2@cs.com
Date: Sun Jun 24 2001 - 22:25:50 EDT


By the way:

> In UTF-17, for example, the Han character sequence <U+5341, U+4E03>
> ('17'), would be converted to:
>
> <38 30 31 31 31 36 30 31 38 30 31 30 37 30 30 33>

Close, but not quite. Try:

<38 30 30 35 31 35 30 31 38 30 30 34 37 30 30 33>

> Because all UTF-17 bytes are in the range 0x30..0x38, this
> UTF-17 byte sequence would also be visible displayed in
> ASCII (or Latin-1) as: "8011160180107003".

"8005150180047003".

This is what you get for doing an implementation. But I did check my answers
against other octal conversion routines, a hand calculator, Windows Calculator,
etc.
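
For anyone who wants to repeat the check, here is a minimal sketch of the conversion. It assumes that each code point becomes the ASCII digit '8' followed by its value as seven zero-padded octal digits. That layout is inferred from the corrected byte sequences above, not taken from the text of the UTF-17 proposal, and the function name utf17_encode is made up for illustration.

```python
def utf17_encode(text: str) -> bytes:
    """Encode text under the ASSUMED UTF-17 layout.

    Assumption (inferred from the worked example, not from a spec):
    each code point is written as '8' followed by seven octal digits,
    zero-padded on the left. Seven digits suffice because U+10FFFF
    is 4177777 in octal.
    """
    return "".join(f"8{ord(ch):07o}" for ch in text).encode("ascii")


if __name__ == "__main__":
    encoded = utf17_encode("\u5341\u4e03")  # the Han sequence for '17'
    print(encoded.hex(" "))         # the bytes, as hex pairs
    print(encoded.decode("ascii"))  # the same bytes viewed as ASCII digits
```

Under that assumption, U+5341 is 51501 in octal and U+4E03 is 47003, which gives the digit string "8005150180047003".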

> Since all UTF-17 bytes display as digits, it is programmer
> friendly. All UTF-17 values will display visibly and correctly
> in any debugger, and the programmer need only recall that
> "80111601" means U+5341, for instance, to get back to the
> original Unicode character.

Ibid. That should be "80051501".
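
Getting back to the original character can be sketched the same way. This again assumes the inferred layout of '8' plus seven octal digits per code point, and utf17_decode is a hypothetical name, not part of any real library.

```python
def utf17_decode(data: bytes) -> str:
    """Decode bytes under the ASSUMED UTF-17 layout.

    Assumption (inferred from the worked example, not from a spec):
    the input is a run of 8-byte groups, each an ASCII '8' followed
    by seven octal digits giving one code point.
    """
    digits = data.decode("ascii")
    if len(digits) % 8 != 0:
        raise ValueError("length is not a multiple of 8 bytes")
    chars = []
    for i in range(0, len(digits), 8):
        group = digits[i:i + 8]
        if group[0] != "8":
            raise ValueError(f"group {group!r} does not start with '8'")
        # int(..., 8) rejects the digits 8 and 9, so a stray lead byte
        # inside the group is reported as an error.
        chars.append(chr(int(group[1:], 8)))
    return "".join(chars)


if __name__ == "__main__":
    print(hex(ord(utf17_decode(b"80051501"))))  # code point of the first group
```

With this sketch, "80051501" decodes to U+5341 and "80047003" to U+4E03.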

-Doug Ewell
 Fullerton, California


