From: Arne Götje (高盛華) (arne@linux.org.tw)
Date: Sun Feb 04 2007 - 05:13:30 CST
On Saturday 27 January 2007 13:35, John H. Jenkins wrote:
> I would love to see them, too, and will gladly add them to Unicode's
> database of known unencoded ideographs (provided we get reasonable
> pointers to documentation as well).
>
> Unfortunately, the ship has sailed on Extension D. Actual proposals
> to encode these will have to wait for Extension E.
Ok, I have scanned the list.
The pdf is here:
http://debian.linux.org.tw/~arne/MinNan_IM/Minnan_missing001.pdf
I also composed a list of all missing characters (the invented ones and
others from the same dictionary) with ideographic description
sequences.
The list is here:
http://debian.linux.org.tw/~arne/MinNan_IM/missing.txt
At least I couldn't find those characters in Unicode... maybe I have
overlooked a few...
Which brings me to another question:
Does anyone have or know of a tool that lets me search for CJK characters
in Unicode based on the components they are made of?
I'm particularly interested in Ext.B characters, because it's a PITA to
scan the PDF manually. The Radical/Stroke search on the Unicode webpage
is not always a big help, since it is not always clear to which radical
a character belongs, especially in Ext.B... :(
So, I'm looking for something like this:
I want to get the codepoint of the character 𣍐.
I search for the components 勿 and 會. Then the character 𣍐 should be
displayed with its codepoint U+23350.
If this kind of database doesn't exist yet, who is with me to create
one?
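A minimal sketch of what such a component search could look like, assuming
each character's ideographic description sequence (IDS) is available in a
table. The names SAMPLE_IDS, build_index and search are hypothetical, and
the IDS entries are hand-written for illustration, not taken from any
existing database:

```python
from collections import defaultdict

# The IDS operators U+2FF0..U+2FFB describe layout, not components,
# so they are skipped when indexing.
IDS_OPERATORS = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}

# Hypothetical sample data: character -> IDS (hand-written, assumed).
SAMPLE_IDS = {
    chr(0x23350): "⿰勿會",  # the character from the example above
    "好": "⿰女子",
    "明": "⿰日月",
}

def build_index(ids_table):
    """Map each component to the set of characters whose IDS contains it."""
    index = defaultdict(set)
    for char, ids in ids_table.items():
        for part in ids:
            if part not in IDS_OPERATORS:
                index[part].add(char)
    return index

def search(index, components):
    """Return (codepoint, character) pairs containing all given components."""
    sets = [index.get(c, set()) for c in components]
    if not sets:
        return []
    hits = set.intersection(*sets)
    return sorted((f"U+{ord(ch):04X}", ch) for ch in hits)

INDEX = build_index(SAMPLE_IDS)
print(search(INDEX, "勿會"))
```

This only matches the direct components listed in each IDS. Finding a
character through the parts of its parts would need the sequences to be
expanded recursively before indexing.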
For the references of the above mentioned missing characters, I would
need some time to collect them... I guess a scan of the dictionary page
in question is not sufficient, is it?
(I also have an additional list of missing characters from a Hakka
dictionary... but unfortunately I need to dig the characters out of
the dictionary myself, since the author didn't provide me with a list
of them... so it will take some time until the list is complete.)
Cheers
Arne
--
Arne Götje (高盛華) <arne@linux.org.tw>
PGP/GnuPG key: 1024D/685D1E8C
Fingerprint: 2056 F6B7 DEA8 B478 311F 1C34 6E9F D06E 685D 1E8C
Key available at wwwkeys.pgp.net. Encrypted e-mail preferred.