From: Mike (mike-list@pobox.com)
Date: Sun Jan 21 2007 - 17:26:12 CST
> 1x 1y 0z => 21 bits (for 1x different from 1000 0000)
> 1x 0y => 14 bits (for 1x different from 1000 0000)
> 0x => 7 bits (the ASCII range)
Now this proposal has some good things going for it.
First, it's simple. Second, it matches UTF-8's length
of 1 byte for ASCII characters, and then it beats UTF-8
by representing everything up to U+3FFF in just 2 bytes
(UTF-8 needs 3 bytes from U+0800 on), and all of Unicode
in at most 3 bytes. The only place that UTF-16 wins is
the range U+4000 to U+FFFF (2 bytes instead of 3). It
ties from U+0080 to U+3FFF (2 bytes each) and loses for
ASCII (2 bytes instead of 1) and for everything above
U+FFFF (4 bytes instead of 3).
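
To make the length comparison concrete, here is a rough Python
sketch that reports bytes per code point for the quoted scheme,
UTF-8 and UTF-16. The function names are mine, not part of the
proposal, and the sketch assumes 7 payload bits per byte while
ignoring the "different from 1000 0000" restriction on the lead
byte, so treat it as an approximation of the proposal rather than
a definition of it.

```python
def proposal_len(cp):
    """Bytes under the quoted scheme, assuming 7 payload bits per byte.

    Assumption: the exclusion of a 1000 0000 lead byte is ignored here.
    """
    if cp < 0x80:        # 0x        -> 7 bits
        return 1
    if cp < 0x4000:      # 1x 0y     -> 14 bits
        return 2
    if cp < 0x200000:    # 1x 1y 0z  -> 21 bits
        return 3
    raise ValueError("code point does not fit in 21 bits")


def utf8_len(cp):
    """Bytes needed by UTF-8 for a code point up to U+10FFFF."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def utf16_len(cp):
    """Bytes needed by UTF-16: one 16-bit unit in the BMP, else two."""
    return 2 if cp < 0x10000 else 4


if __name__ == "__main__":
    # Sample points on either side of each length boundary.
    for cp in (0x41, 0x7FF, 0x800, 0x3FFF, 0x4000, 0xFFFF, 0x10000, 0x10FFFF):
        print(f"U+{cp:04X}: proposal={proposal_len(cp)} "
              f"utf8={utf8_len(cp)} utf16={utf16_len(cp)}")
```

Running it prints one line per sample code point, which makes the
boundaries at U+0080, U+0800, U+4000 and U+10000 easy to see.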
Mike