RE: Speaking of Plane 1 characters...

From: Dominikus Scherkl (
Date: Tue Nov 12 2002 - 07:38:29 EST

    For those of you who _are_ programmers (or at least
    know a little C), there is a somewhat easier formula
    to convert between UTF-16 and UTF-32 for plane 1 and above
    (the offset 0x10000 that is subtracted for the high surrogate
    can be pre-shifted and folded into the constant term,
    since 0xD800 - (0x10000 >> 10) = 0xD7C0):

    utf16high = 0xD7C0 + (utf32 >> 10);
    utf16low = 0xDC00 + (utf32 & 1023);

    This is very easy to invert:

    utf32 = ((utf16high - 0xD7C0) << 10) + (utf16low & 1023);
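
    The formulas above can be sketched as small C functions. This is
    only a minimal sketch: the function names and the range checks via
    assert() are assumptions added for illustration, and only the
    arithmetic itself comes from the formulas.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; only the arithmetic is taken from the message. */

/* High (leading) surrogate for a code point in U+10000..U+10FFFF. */
static uint16_t utf16_high(uint32_t utf32)
{
    assert(utf32 >= 0x10000 && utf32 <= 0x10FFFF);
    return (uint16_t)(0xD7C0 + (utf32 >> 10));
}

/* Low (trailing) surrogate for the same code point. */
static uint16_t utf16_low(uint32_t utf32)
{
    assert(utf32 >= 0x10000 && utf32 <= 0x10FFFF);
    return (uint16_t)(0xDC00 + (utf32 & 1023));
}

/* Inverse: rebuild the code point from a valid surrogate pair. */
static uint32_t utf32_from_pair(uint16_t high, uint16_t low)
{
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return ((uint32_t)(high - 0xD7C0) << 10) + (uint32_t)(low & 1023);
}
```

    For example, U+1D11E comes out as the pair 0xD834 0xDD1E, and
    passing that pair to utf32_from_pair() gives 0x1D11E back.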

    Here utf16high and utf16low are 16-bit surrogates, and
    utf32 is of course a 32-bit value.
    The bit-shift operators >> and << can be replaced
    by ordinary division or multiplication by a power of two
    (here 1024), and the bitwise AND with 1023 is equivalent to
    a modulo-1024 operation on unsigned values.
    But that may be slower (relevant only for really high-speed
    converters ;-).
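
    The shift-free variant just described might look like the sketch
    below. The function names are hypothetical and the values are
    assumed to be unsigned; for unsigned operands, dividing by 1024
    and taking the remainder modulo 1024 give the same results as
    >> 10 and & 1023.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; same arithmetic as the shift version, written
   with division, multiplication and modulo on unsigned values. */

static uint16_t utf16_high_div(uint32_t utf32)
{
    assert(utf32 >= 0x10000 && utf32 <= 0x10FFFF);
    return (uint16_t)(0xD7C0 + utf32 / 1024);   /* same as utf32 >> 10 */
}

static uint16_t utf16_low_mod(uint32_t utf32)
{
    assert(utf32 >= 0x10000 && utf32 <= 0x10FFFF);
    return (uint16_t)(0xDC00 + utf32 % 1024);   /* same as utf32 & 1023 */
}

static uint32_t utf32_from_pair_mul(uint16_t high, uint16_t low)
{
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    /* multiply by 1024 instead of << 10, modulo instead of & 1023 */
    return (uint32_t)(high - 0xD7C0) * 1024 + (uint32_t)(low % 1024);
}
```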

    Best regards.

    Dominikus Scherkl
