"Phonetic grouping" in UniHan

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Mon Feb 04 2002 - 09:21:59 EST


In the on-line UniHan database (http://www.unicode.org/charts/unihan.html) I
see a field that I have never seen before:

        "- Other useful dictionary-like data
                - [...]
                - A phonetic grouping for the character"

The phonetic grouping seems to be an integer number, and I wonder:

- What does this information mean?

- Why do some characters not have it? Is it just missing, or does it not
apply to them?

- Where does it come from? I have not seen a corresponding field in the
plain-text file UniHan.txt.

Thanks in advance.
_ Marco

P.S.: I take this opportunity to congratulate the author(s) of the on-line
UniHan for all the recent improvements, especially the addition of the
Chinese and Japanese compound words.

I would also like to suggest a new field that could be very useful:
the frequency of usage of each character. This information could be derived
from good on-line sources. E.g., for Chinese, from Chi-Ho Tsai's research
(http://www.geocities.com/hao510/charfreq/) and, for Japanese, from the
KanjiDic database (http://www.csse.monash.edu.au/~jwb/kanjidic_doc.html).
(I don't know the licensing terms for using these data.)

_ M.


