Re: Surrogate points

From: Hans Aberg (haberg@math.su.se)
Date: Wed Jan 26 2005 - 11:48:57 CST


    At 16:56 -0800 2005/01/25, Markus Scherer wrote:
    >> Should not the in effect empty Unicode points, U+D800 to U+DFFF, as well as
    >> U+FFFE and U+FFFF, be filled with characters? The current construction gives
    >> a misleading impression that the Unicode character set and character
    >> numbering have anything to do with the encoding UTF-16.
    >
    >You are trying to rewrite history... these code points are designated as they
    >are, and changing them would wreak havoc.

    The current design evidently already causes havoc.

    >> One might also design a new set of encodings for k-bit words, ...
    >
    >With no advantage over what's available, ...

    The advantage is a general, clean, non-confusing design.

    >...and widely implemented, I don't see this getting anywhere.

    There is no harm in introducing a new, superior format, as it does not
    annihilate the old one.

    >> Call, ad hoc, this encoding UE-k. Then UE-16 has the capacity of holding 27
    >> bits in a two-word. UE-8 is the same as UTF-8. And UTF-32 is the same as
    >> UE-32.
    >
    >No one needs, or wants, more than 20.1 bits for Unicode code points.

    Please do not tell others what they may need or want. And the proposal might
    in fact decrease the number of Unicode code points. So jumping to
    conclusions is not very wise here either.
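
    For reference, the figures in this exchange can be checked with a short
    sketch of how UTF-16 uses the range U+D800 to U+DFFF, and of where the
    "20.1 bits" comes from. The names utf16_encode and code_point_bits are
    made up for illustration; this shows only the existing UTF-16 scheme, not
    the UE-k idea, whose details are not spelled out in this thread.

```python
import math

SURROGATE_FIRST = 0xD800   # first code point reserved for UTF-16 surrogates
SURROGATE_LAST = 0xDFFF    # last code point reserved for UTF-16 surrogates
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point


def utf16_encode(cp):
    """Return the list of 16-bit code units that UTF-16 uses for one code point."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError("not a Unicode code point")
    if SURROGATE_FIRST <= cp <= SURROGATE_LAST:
        # These code points carry no character: they exist only so that
        # UTF-16 can use the same 16-bit values as halves of a pair.
        raise ValueError("surrogate code points cannot be encoded")
    if cp < 0x10000:
        return [cp]                      # one 16-bit word
    v = cp - 0x10000                     # 20-bit offset above the BMP
    high = 0xD800 | (v >> 10)            # top 10 bits go in the high surrogate
    low = 0xDC00 | (v & 0x3FF)           # bottom 10 bits go in the low surrogate
    return [high, low]


def code_point_bits():
    """Number of bits needed to number all code points 0..0x10FFFF."""
    return math.log2(MAX_CODE_POINT + 1)


if __name__ == "__main__":
    print([hex(u) for u in utf16_encode(0x1F600)])
    print(round(code_point_bits(), 1))
```

    A two-word UTF-16 sequence thus carries 20 bits of payload, which is why
    the code space ends at U+10FFFF and why the surrogate range is left
    without characters.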

      Hans Aberg


