Re: Unicode, Cure-all or Kill-all?

From: Timothy Huang (timd_huang@mail.formac.com.tw)
Date: Sat Aug 10 1996 - 20:48:49 EDT


Dear Werner,

Werner LEMBERG wrote:
>
> On Fri, 9 Aug 1996 unicode@Unicode.ORG wrote:
>
> > On the number game, of course, 1,114,112 is greater than 75,684. But, as
> > now and up to now, 75,684 is greater than 20,902. And this is the
> > central point of problem. When the Unicoder Savior will let the Chinese
> > people have 'enough' characters to use? and when the system software
> > companies will implement that? Mr. Godot, we are waiting for t-o-o
> > l-o-n-g already. Yes, some chemists are not happy with the GB-2312 and
> > Big-5, both were copy versions of JIS.
>
> Do you have any better solution?
>
> > How many computer systems in Taiwan followed the Taiwan
> > National Standard CNS 11643?
>
> That's a good question. I'm quite curious where I can find software
> (except my own CJK package for LaTeX2e + Mule) which supports CNS.
>
> > what's wrong with the CCCII, so I can learn. Seriously, if you find
> > something wrong with CCCII,...
>
> Christian Wittern <cwittern@central.conline.de>, a former employee of the
> IRIZ in Kyoto, writes in the June 1995 release of the Electronic
> Bodhidharma:
>
> "...there are many cases where CCCII has more than one code point for the
> same character. When encountering such multidefined characters, the user
> has to decide which code point to use. Since these codepoints have
> different semantics, this is a quite impossible task for most input
> operators...The relationship of orthographic characters and variant forms
> is very complex and can not be expressed in a fixed, one-dimensional,
> hard-wired codetable...
>
> ...the character glyphs are neither well defined nor consistent..."
>
> Wittern further writes that CCCII is under revision. I would be glad to
> hear some details.
>
> Werner

Yes, there is at least one better solution: ANSI Z39.64 EACC/CCCII.
It has been used by libraries all over the world.

NO system uses CNS 11643, not even the organizations that drafted (III)
and published (CNS) the standard. That standard is for fools to
believe in.

Regarding "more than one code point for the same character": this is a
misunderstanding. Some Chinese characters have the "same" shape -- they
look like the same character, but in fact they are NOT. Example: the
character Tai2 (a triangle on top of a square). It can be the
simplified form of the Tai in Tai(wan), or a variant of the character
in Typhoon, AND also an orthographic form, as in "Sir". So you see, the
meaning of such a character is very important. When the Chinese
Character Analysis Group (CCAG) did the compilation, they had several
top-notch Chinese scholars review the characters. The key point I want
to make is this: to Western eyes, the shape of a given Chinese
character decides everything. In reality, however, that is not always
right. There are many situations where a variant of one character is
the orthographic form of another, and vice versa. I don't think
Mr. Christian Wittern understands this point.
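
The point about one shape standing for several distinct characters can
be sketched as a small lookup table. This is only an illustration: the
code labels ("X-0001" and so on) and the names Entry, codes_by_shape
and pick_code are hypothetical, and none of the values are real
EACC/CCCII code points.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    code: str   # hypothetical label, NOT a real CCCII/EACC code value
    shape: str  # name for the printed shape
    role: str   # what the character is in this usage


# Illustrative data only, based on the Tai2 example above.
ENTRIES = [
    Entry("X-0001", "tai2", "simplified form of the Tai in Taiwan"),
    Entry("X-0002", "tai2", "variant of the character in typhoon"),
    Entry("X-0003", "tai2", "orthographic form, as in Sir"),
]


def codes_by_shape(entries):
    """Group code labels by the shape they print as."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.shape].append(entry.code)
    return dict(index)


def pick_code(entries, shape, role):
    """Return the code for a shape once the intended role is known."""
    for entry in entries:
        if entry.shape == shape and entry.role == role:
            return entry.code
    raise KeyError((shape, role))
```

Looking up by shape alone returns three candidates; only the intended
role narrows the choice to one code.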

The relationship between orthographic and variant forms is indeed very
(extremely) complicated. However, I think the CCAG did a very good job.
Given the limits of the coding structure, what you called "a fixed,
one-dimensional, hard-wired codetable" is the only way of showing their
relationship. Actually, it is not "one-dimensional" but
three-dimensional. Further relationships must be, and can be, expressed
by an accompanying Chinese Character Data Base, which should be part of
the Chinese Operating System. Speaking of this "wrong", let me give you
another point of view: I would like to code the characters according to
their reading (phonetics). But is any Chinese coding done that way? NO.
Why not? Because the phonetic side of the Chinese language is even more
complicated than the orthographic/variant side. One other thing, please
remember: EACC/CCCII is the only coding scheme that shows the
relationships between characters; no other dares, or even bothers, to
deal with this aspect of Chinese characters. And that shows how well
the Chinese scholars (if any) of their camp did the job -- they either
did not have the guts or did not have the sense to face this issue.
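
One way to picture a "three-dimensional" codetable is a code made of
three coordinates, where related forms share two coordinates and differ
in the third. The sketch below assumes such a layout purely for
illustration: the names Code, related and in_layer are hypothetical,
the numbers are made up, and nothing here is taken from the actual
EACC/CCCII tables.

```python
from typing import NamedTuple


class Code(NamedTuple):
    layer: int  # assumed: which group of forms (e.g. orthographic, variant)
    row: int    # assumed: position shared by related forms
    cell: int   # assumed: position shared by related forms


def related(a, b):
    """Two codes are related if they share a position but not a layer."""
    return (a.row, a.cell) == (b.row, b.cell) and a.layer != b.layer


def in_layer(code, layer):
    """Return the code at the same position in another layer."""
    return code._replace(layer=layer)
```

Under this assumption the relationship is read directly from the code
itself; anything richer would still need the accompanying character
data base.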

Yes, the CCCII is under revision. 75,684 characters were released in
1989. However, for all kinds of reasons, the revision is coming along
very slowly. Mainly, nobody is paying the bill, and Professor Chang is
very sick now. A few individuals are doing a little work in their spare
time; I am one of them. Does anyone out there know of a rich foundation
that would like to sponsor and carry on this great project?

Smiles,
Timothy Huang


