In the on-line UniHan database (http://www.unicode.org/charts/unihan.html) I
see a field that I have never seen before:
"- Other useful dictionary-like data
- [...]
- A phonetic grouping for the character"
The phonetic grouping seems to be an integer number, and I wonder:
- What does this information mean?
- Why do some characters not have it? Is it just missing, or does it not apply
to them?
- Where does it come from? I have not seen a corresponding field in the
plain-text file UniHan.txt.
Thanks in advance.
_ Marco
P.S.: I take this opportunity to congratulate the author(s) of the on-line
UniHan for all the recent improvements, especially the addition of the
Chinese and Japanese compound words.
I would also like to suggest a new field that could be very useful:
the frequency of usage of each character. This information may be derived
from good on-line sources. E.g., for Chinese, from Chi-Ho Tsai's research
(http://www.geocities.com/hao510/charfreq/) and, for Japanese, from the
KanjiDic database (http://www.csse.monash.edu.au/~jwb/kanjidic_doc.html).
(I don't know the licensing terms for using these data.)
_ M.