Re: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Mar 15 2002 - 17:27:43 EST

Previous message: Markus Scherer: "Re: Collation - last character?"
Maybe in reply to: Dan Kogai: "Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)"
Next in thread: Dan Kogai: "Re: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)"
Reply: Dan Kogai: "Re: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)"
Reply: James E. Agenbroad: "Re: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)"

Dan Kogai continued:

> For instance,
>
> http://www.horagai.com/www/moji/int/kasiwa.htm
>
> reports that in Kashiwa, Chiba, a typical suburban city with a
> population of about 210,000, some 21,587 people needed a character
> that was not listed in JIS.

This long interview seems to be about, among other things, the city
shifting over to a new IT system, and there is a bunch of discussion
about how to handle the gaiji from the old system in the new system.
I don't see any evidence here that there is some character in question
that isn't handled by Unicode.

>
> > In any case, a casual browsing around on the "Moji" site doesn't turn up
> > any obvious catalogue of "known characters in Japanese" required
> > for such things as the spellings of "Watanabe" but which are
> > not present in the Unicode Standard. Instead, there is just a
> > lot of general anti-Unicode grousing.
>
> His example does not use the (in)famous Watanabe, but
>
> http://www.horagai.com/www/moji/show.htm
>
> is a great example of the problem that Kanji Unification has caused.
> This one still holds true.

*What* still holds true? These are just well-worn issues of itaiji
(variant forms). The characters from the little anime exhibit of
variants are, in Unicode:

U+9AD8 / U+9AD9
U+5516 / U+555E
U+9593 / U+9592

In each pair, the two code points are variants of the same character
that were encoded separately in Unicode because of the source
separation rule.
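
As a rough illustration of what "cloned into Unicode" means in practice,
the sketch below looks up each code point from the list above with
Python's standard unicodedata module. The helper names (describe,
survives_normalization) are made up for this example. The point it
tries to show is that each member of a pair is an ordinary unified
ideograph in its own right, and that none of the four normalization
forms folds one member onto the other.

```python
import unicodedata

# Variant pairs exactly as listed in the message above.
VARIANT_PAIRS = [(0x9AD8, 0x9AD9), (0x5516, 0x555E), (0x9593, 0x9592)]

NORMALIZATION_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def describe(cp):
    """Return the code point in U+XXXX notation with its character name."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def survives_normalization(cp):
    """True if no normalization form changes this character."""
    ch = chr(cp)
    return all(unicodedata.normalize(form, ch) == ch
               for form in NORMALIZATION_FORMS)


for first, second in VARIANT_PAIRS:
    print(describe(first), "/", describe(second))
    print("  distinct code points:", chr(first) != chr(second))
    print("  unchanged by normalization:",
          survives_normalization(first) and survives_normalization(second))
```

Any relationship between the two members of a pair therefore has to come
from somewhere outside the normalization forms, such as a variant table
maintained by the application.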

And the last one is U+5409 "kichi". For this one, I believe the
variant is simply a zokuji ("vulgar variant") not recognized as
standard in the dictionaries. But it is just one of thousands
of similar variant forms which could be attested for itaiji.

The whole issue of Han variant forms, by the way, is not something
that the Unicode Standard created, nor did Han encoding unification
principles in Unicode and 10646 somehow exacerbate the problem for
IT processing.

> And don't forget that he grouses even more about JIS than about Unicode.

True enough.

> To
> me all he wants is a decent character set that spells names right.

But of course that raises the question of which details of presentation
variation he or other users perceive to be spelling differences. Correct
presentation of all details of Han characters may not *be* the
business of the character encoding per se. There is an architectural
decision to be made regarding the tradeoff between the identity of
characters for processing purposes and the appearance of characters
for rendering purposes, and Kato-san and the IRG appear to disagree
about where that line should be drawn.
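
One way to picture that tradeoff is the sketch below, which is only an
assumption about how an application might draw the line, not anything
specified by Unicode, 10646, or the IRG. The stored text keeps whatever
variant the user typed, so rendering is untouched, while a separate
match key folds variants together for searching and comparison. The
FOLD table is hypothetical: it simply maps the second-listed form of
each pair given earlier in this message onto the first-listed form, and
the two-character sample strings are illustrative only.

```python
# Hypothetical folding table (an assumption for this sketch): the
# second-listed form of each pair maps to the first-listed form.
FOLD = {0x9AD9: 0x9AD8, 0x555E: 0x5516, 0x9592: 0x9593}


def match_key(text):
    """Fold variant forms so that variants compare equal for processing."""
    return text.translate(FOLD)


def same_for_processing(a, b):
    """Compare two strings by match key, ignoring variant differences."""
    return match_key(a) == match_key(b)


# Illustrative strings: the same two-character name written with
# U+9AD9 and with U+9AD8, each followed by U+6A4B.
stored = "\u9ad9\u6a4b"
query = "\u9ad8\u6a4b"

print("identical as stored:", stored == query)
print("same for processing:", same_for_processing(stored, query))
print("stored form left unchanged for display:", stored == "\u9ad9\u6a4b")
```

Where the folding belongs (in the encoding, in a table like this, or in
the font) is exactly the point on which the parties disagree.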

> His
> favorite appears to be ISO-2022, but to me, as Yet Another Perl Encoding
> Hacker, ISO-2022 is a pain in the arse....

You got that right!

--Ken

>
> U+5F48 or U+5F3E
>
>

This archive was generated by hypermail 2.1.2 : Fri Mar 15 2002 - 16:49:23 EST