Re: Chinese Hemigram Analysis

From: Edward Cherlin (edward.cherlin.sy.67@aya.yale.edu)
Date: Mon Aug 30 1999 - 04:29:34 EDT


At 22:01 -0700 8/29/1999, Curtis Clark wrote:
>At 05:18 PM 8/29/99 -0700, Edward Cherlin wrote:
>>It was not clear then, and it is not clear now, how well algorithmic
>>rendering could be done on a sequence of hemigrams plus positioning data
>>and hints. It was also not clear how to carry out some parts of the
>>analysis, given the breadth of styles in the historic record and in current
>>use. Some of us would like bronze, seal, and oracle bone fonts in addition
>>to the usual brush styles. And then, what about "grass" calligraphy?
>
>Now wait a minute. We can't have a dotless j because it is supposed to be
>simple for software to read the decomposition and select a precomposed
>glyph from a font. Why can't that work for CJK? (I know some of the
>reasons, but it is still just the dotless j problem multiplied
>several-hundred-fold.)
>
>
>----------------------------------------------------------------
>Curtis Clark http://www.csupomona.edu/~jcclark/
>Biological Sciences Department Voice: (909) 869-4062
>California State Polytechnic University FAX: (909) 869-4078
>Pomona CA 91768-4032 USA jcclark@csupomona.edu

I would put the factor at much more than several hundred. The fact is that
glyphs in the same font which are agreed to represent the same character
can often be analyzed into quite different arrangements of hemigrams. Look
at the examples on page 19 of Ken Lunde's CJKV Information Processing. The
differences in hemigram composition between brush, seal, and bronze styles
are greater still. In cursive "grass" calligraphy, strokes are often merged or
vanish entirely.

--
Edward Cherlin   edward.cherlin.sy.67@aya.yale.edu
"It isn't what you don't know that hurts you, it's
what you know that ain't so."--Mark Twain, or else
some other prominent 19th century humorist and wit
