Re: "Giga Character Set": Nothing but noise

From: Jon Babcock (jon@kanji.com)
Date: Wed Oct 18 2000 - 13:28:47 EDT


>>>>> "Marco" == Marco Cimarosti <Marco.Cimarosti@icl.com> writes:

> Jon Babcock wrote:
>> It seems to me that if not for that, how could anyone make a
>> Chinese font? Who is going to sit down and draw a *myriad* or
>> more characters? Since elements recur, this reduces the amount
>> of labour required greatly.

> I too would have bet that all CJK foundries used some form of
> (automatic?) composition to build their fonts.

> But, after a few enquiries, it seems that they don't (or they do,
> but zealously guard the secret).

My guess is that the woodblock carvers had the Chinese hemigrams in
mind, as well as the tiny set of individual strokes (9 - 20 ?), as
they carved the thousands upon thousands of woodblocks that were used
in early Chinese printing. A study of the Chinese woodblock carver's
craft would be very interesting in this regard, but probably the
tradition is already dead, and there is no one left to talk to.

The fact is, in the Chinese script, except for the 300 or so wen
(holograms), any one of the remaining > 50,000 zi (digrams) can be
divided in half, and the number of halves (hemigrams) is limited to a
small fraction of the total, perhaps around 2,000. For a 'character'
set, except in a handful of cases, the graphic details of the
juxtaposition or even the individual composition (by strokes) of the
two hemigrams is not required. Those would be strictly font/glyph
rendering issues.

Take, for example, the Chinese character, 'carblapidary'. It would be
enough to know that the character consisted of 'carb' and
'lapidary'. The fact that it may be rendered differently in mainland
China, Taiwan, and Japan, and in different ways at different times,
need not concern the (ideal) Han character set.
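
A minimal sketch of this idea in Python, for illustration only. The
Digram class, the HEMIGRAMS table, and the encode() helper are
hypothetical names invented for this example, not part of any
existing character set or library. The record stores only which two
hemigrams make up the character and says nothing about how they are
laid out, which is left to the font.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Digram:
    """A zi identified only by its two hemigrams (hypothetical model).

    No position, proportion, or stroke data is stored: those are
    treated as font/glyph rendering issues.
    """
    first: str
    second: str


# Hypothetical table from semantic labels to hemigrams.
# A full table would hold perhaps around 2,000 entries.
HEMIGRAMS = {
    "carb": "炭",      # Mandarin tan4, charcoal
    "lapidary": "石",  # Mandarin shi2, Kangxi classifier #112
}


def encode(label_a: str, label_b: str) -> Digram:
    """Build a digram from two hemigram labels; raises KeyError if unknown."""
    return Digram(HEMIGRAMS[label_a], HEMIGRAMS[label_b])


carblapidary = encode("carb", "lapidary")
print(carblapidary)
```

The same Digram value would serve for every regional or historical
rendering of the character; only the font's drawing of it would differ.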

('carb', here, is a semantic notation for the hemigram 70ad, Mandarin
tan4, charcoal, and 'lapidary' stands for Kangxi classifier (radical)
#112, Mandarin shi2, which like all the other classifiers, has been
given two code points in Unicode, in this case 77f3 and 2f6f.)
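
The double encoding of a classifier can be checked with Python's
standard unicodedata module. This is only a sketch: the variable names
are illustrative, and it assumes the unified ideograph for shi2 is
U+77F3, which is where the Unicode character database places it.
Compatibility normalization (NFKC) folds the Kangxi radical code point
onto the unified ideograph.

```python
import unicodedata

carb = "\u70ad"     # hemigram tan4, charcoal
unified = "\u77f3"  # shi2 as a CJK unified ideograph (assumed code point)
radical = "\u2f6f"  # shi2 as Kangxi radical #112

# Print the code point and the Unicode character name of each.
for ch in (carb, unified, radical):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# NFKC applies the radical's compatibility decomposition.
folded = unicodedata.normalize("NFKC", radical)
print(folded == unified)
```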

BTW, Marco, as near as I can recall, the above quotation is not from
me.

Jon

-- 
Jon Babcock <jon@kanji.com>


