From: John H. Jenkins
Date: Mon May 13 2002 - 11:21:45 EDT

On Monday, May 13, 2002, at 04:21 AM, William Overington wrote:

> I have been looking at the characters in the CJK Unified Ideographs
> Extension B document. These are the characters from U+020000 through to
> U+02A6DF, which, as I understand it, are the rarer CJK characters.

Actually, this is not quite true. The vast majority are rare, of course,
and none of them are exactly *common*, but how rare they are depends on
what you're writing. A small number, for example, are from HK SCS and
reflect current needs for Hong Kong, including general-purpose Cantonese
writing. (One is generally not supposed to write Cantonese, even if one
speaks it, hence the lag in getting some Cantonese-specific characters
encoded.)

> I wonder if any of the people who read this list who understand the
> languages involved might please like to say what any one or two of these
> characters, of their choice, mean please, just as a matter of general
> cultural interest for people who see these characters in the Unicode
> specification and, though not themselves knowledgeable of the languages,
> find the characters interesting for their artistry and history.

My personal favorite is U+233B4, which means a tree stump. (It's formed
by taking the "tree" radical and moving the cross-bar to the top of the
character instead of having it in the middle.) U+20C43 is a
Cantonese-specific character meaning thin or flat.

Altogether, eighteen characters from Extension B currently have a
kDefinition entry in Unihan.txt.
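
For anyone who wants to check a count like this, here is a rough Python
sketch. It assumes the Unihan data is laid out as tab-separated lines of
the form "U+XXXX<TAB>field<TAB>value", with "#" starting a comment line.
The function name and the sample lines are made up for illustration: the
two Extension B definitions are paraphrased from this message rather than
copied from the file, and the block limits are the ones quoted above.

```python
# Extension B block limits, as quoted earlier in this message.
EXT_B_FIRST = 0x20000
EXT_B_LAST = 0x2A6DF


def ext_b_definitions(lines):
    """Return {code point: definition} for Extension B kDefinition lines.

    Assumes tab-separated "U+XXXX<TAB>field<TAB>value" records.
    """
    found = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t", 2)
        if len(fields) != 3 or fields[1] != "kDefinition":
            continue  # malformed record or some other field
        code_point = int(fields[0][2:], 16)  # drop the leading "U+"
        if EXT_B_FIRST <= code_point <= EXT_B_LAST:
            found[code_point] = fields[2]
    return found


# Illustrative sample only, not real file contents.
SAMPLE = [
    "# sample data for illustration",
    "U+4E00\tkDefinition\tone",
    "U+4E00\tkTotalStrokes\t1",
    "U+233B4\tkDefinition\ttree stump",
    "U+20C43\tkDefinition\tthin or flat",
]

for cp, meaning in sorted(ext_b_definitions(SAMPLE).items()):
    print(f"U+{cp:04X}: {meaning}")
```

To run it against the real data, pass an open file instead of SAMPLE,
for example ext_b_definitions(open("Unihan.txt", encoding="utf-8")),
and take len() of the result.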

John H. Jenkins

