From: John Cowan (cowan@mercury.ccil.org)
Date: Mon Apr 21 2003 - 10:02:31 EDT
Elliotte Rusty Harold scripsit:
> Is there anywhere I can find or piece together a *complete* list of
> Unicode characters that are available in Big5 (and other similar
> sets)? I've looked at unihan.txt, and it has part of what I need but
> not all of it. It specifies which Unicode Han characters are
> available in which other character sets. However, most of these
> character sets include various ASCII characters, Greek letters,
> symbols, digits, and so forth. These do not appear to be listed in
> unihan.txt.
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT .
Don't be afraid of the "OBSOLETE" in the URL, which just means that
these are not officially maintained by Unicode any more. In general,
http://www.unicode.org/Public/MAPPINGS is a good source of mappings.
http://crl.nmsu.edu/~mleisher/csets.html is another high-quality set,
not overlapping with the Unicode one. Both sites use the same simple
tabular format.
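As a rough illustration of how little code that tabular format needs, here is a minimal Python sketch. It assumes each data line holds a legacy code and a Unicode code point, both in hex with a `0x` prefix and separated by whitespace, and that `#` starts a comment. The function name `parse_mapping` and the sample lines are made up for the example; check the header of the actual file before relying on the column layout.

```python
def parse_mapping(lines):
    """Parse mapping-table lines into {legacy_code: unicode_code_point}.

    Assumed layout (verify against the real file's header comments):
    column 1 = legacy code in hex, column 2 = Unicode code point in hex,
    '#' begins a comment that runs to the end of the line.
    """
    mapping = {}
    for line in lines:
        # Strip the trailing comment and surrounding whitespace.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue  # blank or comment-only line
        fields = data.split()
        if len(fields) < 2:
            continue  # legacy code listed without a Unicode equivalent
        # int(..., 16) accepts an optional "0x" prefix.
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping


# Illustrative lines written in the style of the tables, not copied from them.
sample = [
    "# Hypothetical header comment",
    "",
    "0xA140\t0x3000\t# IDEOGRAPHIC SPACE",
    "0xA141\t0xFF0C\t# FULLWIDTH COMMA",
]
table = parse_mapping(sample)
```

The set of Unicode characters available in the legacy set is then just `set(table.values())`, which covers the symbols, Greek letters, and digits as well as the Han characters.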
You will probably get complaints that your Big5 mapping is not complete.
This is because there is no standard way to map Big5 extensions to
Unicode -- indeed, the various vendors do not even agree on what the
extensions are.
I look forward to the excellent small and fast code you will design
for representing the list of valid characters in Big5 and other large
character sets....
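One possible approach to that representation, offered only as a sketch and not as anything proposed in this thread, is to collapse the valid code points into sorted inclusive ranges and binary-search them. The names `to_ranges` and `make_lookup` are hypothetical. How well this compresses depends on how contiguous the set is; scattered Han characters may yield many short ranges.

```python
from bisect import bisect_right


def to_ranges(code_points):
    """Collapse code points into a sorted list of inclusive (lo, hi) ranges."""
    ranges = []
    for cp in sorted(set(code_points)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1][1] = cp  # extend the current run
        else:
            ranges.append([cp, cp])  # start a new run
    return [tuple(r) for r in ranges]


def make_lookup(ranges):
    """Return a membership test that binary-searches the range starts."""
    starts = [lo for lo, _ in ranges]

    def contains(cp):
        # Index of the last range whose start is <= cp, or -1 if none.
        i = bisect_right(starts, cp) - 1
        return i >= 0 and cp <= ranges[i][1]

    return contains


# Small made-up set of code points for demonstration.
ranges = to_ranges([0x41, 0x42, 0x43, 0x3000, 0x4E00, 0x4E01])
is_valid = make_lookup(ranges)
```

Each lookup costs one binary search over the range starts, and the storage is two integers per run rather than one entry per character.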
-- 
Deshil Holles eamus. Deshil Holles eamus. Deshil Holles eamus.
Send us, bright one, light one, Horhorn, quickening, and wombfruit. (3x)
Hoopsa, boyaboy, hoopsa! Hoopsa, boyaboy, hoopsa! Hoopsa, boyaboy, hoopsa!
  -- Joyce, _Ulysses_, "Oxen of the Sun"       jcowan@reutershealth.com