By the way:
> In UTF-17, for example, the Han character sequence <U+5341, U+4E03>
> ('17'), would be converted to:
>
> <38 30 31 31 31 36 30 31 38 30 31 30 37 30 30 33>
Close, but not quite. Try:
<38 30 30 35 31 35 30 31 38 30 30 34 37 30 30 33>
> Because all UTF-17 bytes are in the range 0x30..0x38, this
> UTF-17 byte sequence would also be visible displayed in
> ASCII (or Latin-1) as: "8011160180107003".
That would be "8005150180047003".
This is what you get for doing an implementation. But I did check my answers
against other octal conversion routines, a hand calculator, Windows Calculator,
etc.
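
For anyone who wants to repeat the check, here is a minimal sketch in Python.
It is not the implementation mentioned above. It assumes the layout implied by
the corrected bytes: each code point becomes the ASCII digit '8' followed by
seven octal digits. The name utf17_encode is made up for illustration.

```python
def utf17_encode(text: str) -> bytes:
    """Encode text as 'UTF-17' under an assumed layout.

    Assumption (inferred from the corrected example, not from a spec):
    each code point is written as ASCII '8' plus seven zero-padded
    octal digits, so every byte falls in 0x30..0x38.
    """
    return "".join(f"8{ord(ch):07o}" for ch in text).encode("ascii")


encoded = utf17_encode("\u5341\u4e03")           # <U+5341, U+4E03>
print(encoded.decode("ascii"))                    # 8005150180047003
print(" ".join(f"{b:02X}" for b in encoded))      # 38 30 30 35 31 35 30 31 38 30 30 34 37 30 30 33
```

Seven octal digits hold 21 bits, which is enough for anything up to U+10FFFF.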
> Since all UTF-17 bytes display as digits, it is programmer
> friendly. All UTF-17 values will display visibly and correctly
> in any debugger, and the programmer need only recall that
> "80111601" means U+5341, for instance, to get back to the
> original Unicode character.
Ibid. The string for U+5341 is "80051501".
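
Getting back to the original character need not depend on the programmer's
memory. A rough sketch of the reverse direction, under the same assumed layout
('8' plus seven octal digits per code point); utf17_decode is a hypothetical
name, not part of any real library:

```python
def utf17_decode(data: bytes) -> str:
    """Decode 'UTF-17' bytes under an assumed layout.

    Assumption: the input is a run of 8-byte units, each an ASCII '8'
    followed by seven octal digits giving one code point.
    """
    text = data.decode("ascii")
    if len(text) % 8 != 0:
        raise ValueError("length is not a multiple of 8 bytes")
    chars = []
    for i in range(0, len(text), 8):
        unit = text[i:i + 8]
        if unit[0] != "8":
            raise ValueError(f"unit {unit!r} does not start with '8'")
        # int(..., 8) rejects non-octal digits; chr() rejects values past U+10FFFF.
        chars.append(chr(int(unit[1:], 8)))
    return "".join(chars)


decoded = utf17_decode(b"8005150180047003")
print([f"U+{ord(ch):04X}" for ch in decoded])     # ['U+5341', 'U+4E03']
```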
-Doug Ewell
Fullerton, California
This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:18 EDT