Re: Hypersurrogates

From: Benjamin M Scarborough (benjamin.scarborough@student.utdallas.edu)
Date: Sat Aug 29 2009 - 00:13:56 CDT

    Kenneth Whistler wrote:

    >There are such permanent rulings. Codes beyond U+10FFFF will
    >not be used in future versions of the Unicode Standard.

    And thus, it's now just as practical to talk about the structural
    U+7FFFFFFF maximum as it would be to talk about U+FFFFFFFFFFFFFFFF—they
    can both be mapped to UTF-8†, but the structure of UTF-16 is not going
    to change to accommodate anything past U+10FFFF.

    Sometimes I think people forget just how big a 1,114,112-code-point
    space really is.
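
As a quick check on where the U+10FFFF ceiling and the 1,114,112 figure come from, here is a minimal sketch of the UTF-16 surrogate arithmetic. The function name `decode_surrogate_pair` is just an illustrative choice, not something from any library.

```python
# UTF-16 surrogate ranges: high D800..DBFF, low DC00..DFFF.
HIGH_MIN, HIGH_MAX = 0xD800, 0xDBFF
LOW_MIN, LOW_MAX = 0xDC00, 0xDFFF


def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a high and a low surrogate into one code point."""
    if not (HIGH_MIN <= high <= HIGH_MAX and LOW_MIN <= low <= LOW_MAX):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate contributes 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - HIGH_MIN) << 10) + (low - LOW_MIN)


# The largest pair gives the highest code point UTF-16 can reach.
max_code_point = decode_surrogate_pair(HIGH_MAX, LOW_MAX)

# Code points run from 0 to max_code_point inclusive (17 planes of 65,536).
code_space_size = max_code_point + 1

print(f"U+{max_code_point:X}", code_space_size)
```

Two 10-bit halves plus the 0x10000 offset leave no room above U+10FFFF, which is why the limit is structural for UTF-16 rather than a matter of policy alone.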

    —Ben Scarborough

    †If my understanding of UTF-8 is correct, U+7FFFFFFF would be FD BF BF
    BF BF BF and U+FFFFFFFFFFFFFFFF would be FF BE 8F BF BF BF BF BF BF BF
    BF BF BF. 13 bytes, isn't that fun?
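
The byte sequences in the footnote can be reproduced with a small sketch. It rests on assumptions that are not part of today's UTF-8: current UTF-8 (RFC 3629) stops at 4 bytes and U+10FFFF, the 5- and 6-byte forms come from the older definition, and anything led by FE or FF is a purely hypothetical extension. The reading assumed here is that an n-byte sequence starts with n one-bits and a zero, and that once the lead byte is full (FF) the length prefix spills into the payload bits of the continuation bytes. The name `encode_extended_utf8` is made up for illustration.

```python
def encode_extended_utf8(cp: int) -> bytes:
    """Encode cp with a UTF-8-style scheme extended past 6 bytes.

    Hypothetical: only sequences of 1 to 4 bytes are real UTF-8 today.
    """
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:
        return bytes([cp])  # single-byte form: 0xxxxxxx

    # An n-byte sequence has 8 + 6*(n-1) bit slots, of which n+1 hold the
    # length prefix (n ones and a zero), leaving 5n + 1 payload bits.
    n = 2
    while cp.bit_length() > 5 * n + 1:
        n += 1
    payload_bits = 5 * n + 1

    # Bit stream: n ones, one zero, then the zero-padded payload.
    stream = (((1 << n) - 1) << (payload_bits + 1)) | cp

    # Peel off continuation bytes (10xxxxxx) from the low end.
    out = []
    for _ in range(n - 1):
        out.append(0x80 | (stream & 0x3F))
        stream >>= 6
    out.append(stream & 0xFF)  # whatever remains is the lead byte
    return bytes(reversed(out))


def as_hex(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


print(as_hex(encode_extended_utf8(0x7FFFFFFF)))
print(as_hex(encode_extended_utf8(0xFFFFFFFFFFFFFFFF)))
```

Under these assumptions the sketch agrees with Python's built-in encoder for ordinary code points, and 13 bytes is the first length whose 66 payload bits can hold a 64-bit value (12 bytes would offer only 61).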


