From: Doug Ewell (dewell@adelphia.net)
Date: Sat Oct 12 2002 - 16:57:30 EDT
Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
>> IBM has a Web page containing many PDF charts of code pages, and they
>> have the same problem: some show one GCGID for U+03B8, others show the
>> other one.
>
> Wouldn't you be able to tell by the shape associated with the GCGID?
There were no shapes to look at. The tables in the Unicode 1.0 book,
and tables-in-electronic-form associated with Unicode 1.1, identified
the Unicode character U+03B8 with GT610000 in the mapping tables for
MS-DOS code page 869 and EBCDIC code page 875, and with GT610002 for
Windows code page 1253 and various East Asian DBCS code pages.
In both the Unicode 1.0 and 3.0 books, U+03B8 is represented with the
"straight" theta glyph, while U+03D1 (not listed in any of the Unicode
1.x tables) is represented with the "loopy" glyph.
Markus's answer seems to indicate that GT61 is what really identifies
the Greek lower-case theta. The "0001" suffix specifically calls for
the loopy glyph and "0002" calls for the straight glyph, while "0000" is
a generic suffix (exact glyph unspecified). But as I wrote in a
separate message to Markus, it gets worse; there are other Unicode
characters (mainly symbols) for which two or more *very* different
GCGIDs are listed, depending on which reference source you use.
It seems that GCGIDs predate any formal distinction between character
and glyph of the type adopted by Unicode, making it somewhere between
difficult and impossible to create a 1-to-1 mapping table between GCGIDs
and Unicode.
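
To make the ambiguity concrete, here is a minimal Python sketch that uses only the three assignments mentioned in this message. The names REPORTED, split_gcgid, gcgids_by_code_point and ambiguous are hypothetical, chosen for illustration, and the 4+4 split of a GCGID into base and suffix is an assumption based on the "GT61" plus "0000"/"0001"/"0002" reading described above, not on any IBM specification.

```python
from collections import defaultdict

# (source table, GCGID, Unicode code point), taken only from this message.
REPORTED = [
    ("MS-DOS code page 869", "GT610000", 0x03B8),
    ("EBCDIC code page 875", "GT610000", 0x03B8),
    ("Windows code page 1253", "GT610002", 0x03B8),
]


def split_gcgid(gcgid):
    """Split an 8-character GCGID into (base, suffix), e.g. ('GT61', '0002').

    Assumption: the first four characters identify the character and the
    last four select a glyph variant, as the GT61 discussion suggests.
    """
    if len(gcgid) != 8:
        raise ValueError(f"expected 8 characters, got {gcgid!r}")
    return gcgid[:4], gcgid[4:]


def gcgids_by_code_point(rows):
    """Collect every GCGID that some source table lists for a code point."""
    table = defaultdict(set)
    for _source, gcgid, code_point in rows:
        table[code_point].add(gcgid)
    return dict(table)


def ambiguous(table):
    """Return only the code points that have more than one GCGID listed."""
    return {cp: sorted(ids) for cp, ids in table.items() if len(ids) > 1}


if __name__ == "__main__":
    for cp, ids in ambiguous(gcgids_by_code_point(REPORTED)).items():
        bases = {split_gcgid(g)[0] for g in ids}
        print(f"U+{cp:04X}: {', '.join(ids)} (bases: {', '.join(sorted(bases))})")
```

For U+03B8 the two listed GCGIDs at least share the base GT61, so comparing bases alone would reconcile them. That shortcut would not help with the symbols mentioned earlier, where the listed GCGIDs are very different from one another.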
-Doug Ewell
Fullerton, California