John Cowan wrote:
> James E. Agenbroad wrote:
> > 00B1 0000 0000 1011 0001 1100 0010 1011 0001 C2B1
> 1100 0000 1011 0001 C0B1
James' value C2,B1 was correct, and it can be obtained even using your own
table:
> U+0080 to U+07FF: 110- ---- 10-- ----
1100 0010 1011 0001
Probably, the confusion was caused by that 1011 nibble, recurring twice in
the same position -- by pure coincidence.
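
For anyone who wants to double-check the two-byte row mechanically, here is a
minimal sketch in Python. The function name encode_two_byte is made up for
this illustration; it only covers U+0080 to U+07FF and is compared against
Python's built-in encoder rather than taken on faith.

```python
def encode_two_byte(cp):
    """Encode a scalar value in U+0080..U+07FF as two UTF-8 bytes.

    Hypothetical helper for illustration; it fills the pattern
    110- ---- 10-- ---- with the 11 payload bits of the scalar value.
    """
    if not 0x80 <= cp <= 0x7FF:
        raise ValueError("scalar value is outside the two-byte range")
    lead = 0b11000000 | (cp >> 6)      # 110x xxxx: upper 5 payload bits
    trail = 0b10000000 | (cp & 0x3F)   # 10xx xxxx: lower 6 payload bits
    return bytes([lead, trail])


result = encode_two_byte(0x00B1)
print(result.hex().upper())                   # C2B1
print(result == "\u00b1".encode("utf-8"))     # True
```

For U+00B1 the lead byte works out to 1100 0010 and the trail byte to
1011 0001, i.e. C2,B1 as James wrote.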
By the way, thank you for this manual method: I will print and store it in
my wallet near my blood group.
By the way #2, have you noticed that the least-significant 4 bits of the last
UTF-8 byte are always the same as those of the corresponding scalar value? As
these 4 bits correspond exactly to the rightmost hex digit, we could simply
ignore the last digit.
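
This observation can be checked by brute force. The following is only a
sketch in Python: low_nibble_matches is a name invented here, and the
surrogate range U+D800 to U+DFFF is skipped because Python's encoder refuses
to encode it.

```python
def low_nibble_matches(cp):
    """True if the last UTF-8 byte and the scalar value share their low 4 bits."""
    last_byte = chr(cp).encode("utf-8")[-1]
    return (last_byte & 0x0F) == (cp & 0x0F)


# Check every scalar value from U+0000 to U+10FFFF, skipping surrogates.
mismatches = [
    cp
    for cp in range(0x110000)
    if not 0xD800 <= cp <= 0xDFFF and not low_nibble_matches(cp)
]
print(len(mismatches))  # 0
```

The reason is that the last byte is either the ASCII value itself or
10-- ---- filled with the low 6 bits of the scalar value, so the low 4 bits
pass through unchanged.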
This allows building a table mapping scalar values to UTF-8 values that
contains only 4096 entries, one per group of 16 code points (well, provided we
ignore all the "new" code points U-00010000 to U-0010FFFF). E.g.:
...
U+00B? = C2,B?
...
U+266? = E2,99,A?
...
Such a list could easily be folded in one's wallet, so that UTF-8 conversions
can be worked out even on a desert island with no computers.
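
For those not yet on the island, a rough Python sketch of how such a list
might be generated. The name wallet_table and the row format are assumptions
modelled on the two examples above. One deviation to note: the rows for
U+D800 to U+DFFF are left out, since surrogates have no UTF-8 form, so this
sketch yields somewhat fewer than 4096 rows.

```python
def wallet_table():
    """Map 'U+xxx?' prefixes to UTF-8 byte patterns with the last digit as '?'.

    Hypothetical helper for illustration; covers U+0000..U+FFFF only.
    """
    rows = {}
    for prefix in range(0x1000):           # one row per group of 16 code points
        first = prefix << 4                # first scalar value in the group
        if 0xD800 <= first <= 0xDFFF:
            continue                       # surrogates cannot be encoded
        encoded = chr(first).encode("utf-8")
        # All bytes but the last are written in full; the last byte keeps its
        # high hex digit, and its low digit (same as the scalar's) becomes '?'.
        parts = [f"{b:02X}" for b in encoded[:-1]]
        parts.append(f"{encoded[-1] >> 4:X}?")
        rows[f"U+{prefix:03X}?"] = ",".join(parts)
    return rows


table = wallet_table()
print("U+00B? =", table["U+00B?"])   # U+00B? = C2,B?
print("U+266? =", table["U+266?"])   # U+266? = E2,99,A?
```

Since the boundaries U+0080 and U+0800 are both multiples of 16, no group of
16 ever straddles two encoding lengths, which is why one row per group is
enough.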
Ciao. Marco
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:00 EDT