RE: Variant Glyph Display

From: Chris Pratley (chrispr@exchange.microsoft.com)
Date: Thu Apr 10 2003 - 15:12:33 EDT

    Part of the PUA (starting at the bottom) is used by Word to handle EUDC
    characters from standards such as Big5, Shift-JIS, etc., when data from
    those charsets is mapped to Unicode.

    Word uses a portion of the PUA to hold this data (actually, Windows does
    this mapping). Since those EUDC characters expect Asian layout
    characteristics, font binding, etc., we internally make some assumptions
    about that part of the PUA (realistically, this is by far the most common
    usage of that region). Also, if we didn't make those assumptions, then
    plain-text "EUDC" text copied from, say, Notepad to Word as Unicode text
    would not pick up the required Asian properties. Lots of existing apps
    place only plain text on the clipboard (Unicode or a local code page such
    as Big5), so we have to assume the Asian properties or things don't work
    for those EUDC characters.

    There are other parts of the PUA that are not bound this way in Word
    and that can be used for whatever purpose (primarily the higher ranges).
    We've also been asked by some users to assign part of the PUA as
    "assumed Complex" or "assumed bi-di" to help out with minority scripts
    with those properties. That hasn't happened yet. It's obviously not a
    perfect solution, but the PUA is not perfect by design: it was meant as a
    way to handle this sort of uncategorized stuff, but was never expected to
    handle it gracefully until these characters get properly encoded.
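
For readers following the range discussion, here is a minimal Python sketch of the three private-use ranges defined by the Unicode Standard. It only reports which range a character falls in. The message does not say where Word draws the line for its "assumed Asian" sub-range, so the sketch does not try to model that. The names PUA_RANGES, pua_block and is_private_use are illustrative, not anything from Word or Windows.

```python
import unicodedata

# The three private-use ranges defined by the Unicode Standard (inclusive).
PUA_RANGES = {
    "BMP Private Use Area": (0xE000, 0xF8FF),
    "Supplementary Private Use Area-A": (0xF0000, 0xFFFFD),
    "Supplementary Private Use Area-B": (0x100000, 0x10FFFD),
}


def pua_block(ch):
    """Return the name of the private-use range containing ch, or None."""
    cp = ord(ch)
    for name, (lo, hi) in PUA_RANGES.items():
        if lo <= cp <= hi:
            return name
    return None


def is_private_use(ch):
    """Cross-check via the general category: private-use characters are 'Co'."""
    return unicodedata.category(ch) == "Co"
```

A character at the bottom of the BMP range, such as U+E000, and one in the F0000 range mentioned in the quoted note below are both private use as far as Unicode is concerned; any different treatment of the two is an application-level decision.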

    Chris

    -----Original Message-----
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    Behalf Of Rick McGowan
    Sent: Thursday, April 10, 2003 9:27 AM
    To: unicode@unicode.org
    Subject: Re: Variant Glyph Display

    By the way, in looking at the web site James Kass mentioned, I find:

    > Note: I would have preferred to place these characters in the Private Zone
    > in the F0000 range, but Microsoft Word recognises characters in this zone
    > as Chinese, and changes fonts automatically, messing up the text. A major
    > hassle. Hopefully, Microsoft will catch on to the current Unicode
    > encoding, and it won't be a problem anymore to use the Private Zone.

    Can anyone comment on the accuracy/applicability of the above quote? Is
    this only the case with older Word versions? Or what?

    Thanks,
            Rick


