Re: Unicode to UTF-8

From: John Cowan (jcowan@reutershealth.com)
Date: Wed Mar 15 2000 - 13:50:50 EST


"James E. Agenbroad" wrote:

> Unicode                      UTF-8                                    Name
> hex  = binary                binary                        = hex
>
> 00B1   0000 0000 1011 0001   1100 0010 1011 0001             C2B1     Plus/minus
>                              1100 0000 1011 0001             C0B1
>
> 266D   0010 0110 0110 1101   1110 0010 1001 1001 1010 1101   E299AD   Flat
Correct.

The method for manual conversion of Unicode scalar values to UTF-8 is to
choose the appropriate template from this table:

U+0000 to U+007F: 0--- ----
U+0080 to U+07FF: 110- ---- 10-- ----
U+0800 to U+FFFF: 1110 ---- 10-- ---- 10-- ----
U+10000 to U+10FFFF: 1111 0--- 10-- ---- 10-- ---- 10-- ----

Then replace the hyphens with the bits of the Unicode scalar value,
*working from right to left*. Stop when all hyphens are gone.
In handwriting, it is more readable to use LOW LINEs than hyphens.
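
As a rough sketch of the procedure above, here is one way it might look in
Python. The function name utf8_encode and the way the templates are stored
(one string per template, eight slots per octet) are my own choices for
illustration and not part of any standard API. The surrogate check is an
assumption drawn from the definition of a Unicode scalar value.

```python
def utf8_encode(scalar: int) -> bytes:
    """Fill the matching template with the bits of `scalar`, right to left."""
    # Assumption: only Unicode scalar values are accepted (no surrogates).
    if not 0 <= scalar <= 0x10FFFF or 0xD800 <= scalar <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")

    # (upper bound of range, template); '-' marks a slot still to be filled.
    templates = [
        (0x7F, "0-------"),
        (0x7FF, "110----- 10------"),
        (0xFFFF, "1110---- 10------ 10------"),
        (0x10FFFF, "11110--- 10------ 10------ 10------"),
    ]
    template = next(t for limit, t in templates if scalar <= limit)

    out = list(template)
    bits = scalar
    # Walk the template from right to left, replacing each hyphen with
    # the next lowest bit of the scalar value.
    for i in range(len(out) - 1, -1, -1):
        if out[i] == "-":
            out[i] = str(bits & 1)
            bits >>= 1

    # Each space-separated group is one octet written in binary.
    return bytes(int(octet, 2) for octet in "".join(out).split())
```

With this sketch, utf8_encode(0x00B1) yields the octets C2 B1 and
utf8_encode(0x266D) yields E2 99 AD, which agrees with the table quoted above.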

-- 

Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)


