On Tue, 6 Nov 2001, John H. Jenkins wrote:
>
> On Tuesday, November 6, 2001, at 09:41 AM, Michael Everson wrote:
>
> > John Jenkins said:
> >
> >> Ah, but you should never underestimate the power of the Force. Remember
> >> that the IRG is *already* looking to adding some 60,000 ideographs for
> >> Extension C.
> >
> > Where do they find these things?
> The overwhelming majority of them are coming from medieval Korean Buddhist
> documents.
I'm dumbstruck to hear that. Sixty thousand ideographs from
medieval Korean Buddhist documents!! Oh, my gosh. Do they really have
60,000 Chinese characters not yet encoded? I wonder whether the IRG has
given up on unification. I know what these documents are (八萬大藏經 or
高麗大藏經). At first I thought you might have added an extra '0',
but the position of the comma suggests that you couldn't have.
I went to http://www.sutra.re.kr (http://www.sutra.re.kr/english )
and they indeed have detailed statistics on variants (the number
of occurrences, relative frequency, etc.). They're available at
<http://211.46.71.249/handic/index.htm>. The default indexing method
is based on radicals. Choose a radical in the bottom left panel and
the stroke count in the bottom right panel and the list of characters
with the radical and stroke count of your choice will show up below the
stroke count list. If you click on a character, the top panel will give
you details on it (Unicode code point + 'variant index'??,
pronunciations in Korean, Chinese and Japanese, etc.). At the very end of
the top panel, you'll find an icon with a TV-like symbol at the right end
(the Korean string to its left is '정체,이체자 정보', meaning 'variant
info'). Clicking on it will open a new window with the detailed
statistics I mentioned above. I think this is a pretty nice source of
Chinese character variants, along with the much more comprehensive
Chinese character variant dictionary in Taiwan
(available at http://140.111.1.40).
They also have the overall statistics of Chinese characters found
in the documents at <http://211.46.71.249/charstatistics>. The more
detailed statistics with frequency rank is at
<http://211.46.71.249/charstatistics/freqrank.htm>.
Jungshik Shin