Re: Proposing UTF-21/24

From: Mike (mike-list@pobox.com)
Date: Sun Jan 21 2007 - 17:26:12 CST


    > 1x 1y 0z => 21 bits (for 1x different from 1000 0000)
    > 1x 0y => 14 bits (for 1x different from 1000 0000)
    > 0x => 7 bits (the ASCII range)
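
    The three forms quoted above can be sketched as a small encoder
    and decoder. This is only an illustration under assumptions that
    the proposal does not spell out: the 7-bit groups are taken most
    significant first (x, then y, then z), "1x different from
    1000 0000" is read as "a lead byte of 0x80 is invalid", and the
    names encode_utf21 and decode_utf21 are made up for this example.

```python
def encode_utf21(cp: int) -> bytes:
    """Encode one code point in the quoted 1/2/3-byte scheme (sketch)."""
    if not 0 <= cp <= 0x1FFFFF:          # 21 bits is the most the scheme holds
        raise ValueError("code point does not fit in 21 bits")
    if cp < 0x80:                        # 0x            => 7 bits
        return bytes([cp])
    if cp < 0x4000:                      # 1x 0y         => 14 bits
        return bytes([0x80 | (cp >> 7), cp & 0x7F])
    return bytes([                       # 1x 1y 0z      => 21 bits
        0x80 | (cp >> 14),
        0x80 | ((cp >> 7) & 0x7F),
        cp & 0x7F,
    ])


def decode_utf21(data: bytes) -> list[int]:
    """Decode a byte string into code points (assumed big-endian groups)."""
    out = []
    i = 0
    while i < len(data):
        cp = 0
        for k in range(3):               # at most three bytes per character
            if i >= len(data):
                raise ValueError("truncated sequence")
            b = data[i]
            i += 1
            if k == 0 and b == 0x80:     # assumed reading of the 1000 0000 rule
                raise ValueError("lead byte 0x80 is not allowed")
            cp = (cp << 7) | (b & 0x7F)
            if b < 0x80:                 # high bit clear marks the last byte
                break
        else:
            raise ValueError("sequence longer than 3 bytes")
        out.append(cp)
    return out
```

    With the 0x80 lead byte excluded, every code point has exactly
    one encoding: a 2-byte form always carries a value of at least
    U+0080, and a 3-byte form a value of at least U+4000.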

    This proposal has some good things going for it.
    First, it's simple. Second, it matches UTF-8's length
    of 1 byte for ASCII characters, and it beats UTF-8
    by representing everything up to U+3FFF in just
    2 bytes (UTF-8 needs 3 bytes from U+0800 on), and
    all of Unicode in at most 3 bytes. The only place
    that UTF-16 wins is the range U+4000 to U+FFFF
    (2 bytes instead of 3). It ties for U+0080 to U+3FFF
    (2 bytes each) and loses for all other characters.
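
    The byte counts in this comparison can be checked with a few
    lines of Python. This is a rough sketch: proposed_len is a
    hypothetical helper that returns only the length the quoted
    scheme would need, the sample code points are picked at the
    range boundaries, and the UTF-8 and UTF-16 lengths come from
    Python's built-in codecs.

```python
def proposed_len(cp: int) -> int:
    """Bytes needed by the quoted 7/14/21-bit scheme (hypothetical helper)."""
    if cp < 0x80:
        return 1
    if cp < 0x4000:
        return 2
    if cp <= 0x1FFFFF:
        return 3
    raise ValueError("code point does not fit in 21 bits")


def compare(cp: int) -> tuple[int, int, int]:
    """Return (proposed, UTF-8, UTF-16) byte lengths for one code point."""
    ch = chr(cp)
    return (
        proposed_len(cp),
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),     # "-le" avoids counting a BOM
    )


if __name__ == "__main__":
    # Boundary code points; surrogates are skipped since they cannot be encoded.
    for cp in (0x41, 0x7FF, 0x800, 0x3FFF, 0x4000, 0xFFFF, 0x10000, 0x10FFFF):
        p, u8, u16 = compare(cp)
        print(f"U+{cp:04X}  proposed={p}  UTF-8={u8}  UTF-16={u16}")
```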

    Mike


