On Friday, August 9, 2002, at 11:38 AM, Andrew C. West wrote:
> My point is that if the commonly encountered taboo variants are
> already encoded in CJK-B, then either the other taboo variants should
> also be added to CJK-B or they could be *described* using IDCs.
Encoding them was a mistake, pure and simple. We didn't monitor the
IRG well enough in the CJK-B encoding process, or we would have
objected to this kind of cruft.
And describing them is a valid approach. It depends on what's more
important to you: the appearance (which IDSs are better at), or the
semantics (which are explicit with the TVS).
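As a rough illustration of what describing a character with IDCs
involves, here is a minimal Python sketch that checks whether a string
is structurally a single ideographic description sequence. The function
names are made up for this example, it assumes only the twelve IDCs
U+2FF0..U+2FFB, and it does not check that the operands are really
ideographs or radicals, nor enforce any length limit.

```python
# Ideographic Description Characters (IDCs), U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the others take two.
TERNARY_IDCS = {"\u2FF2", "\u2FF3"}
ALL_IDCS = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}


def idc_arity(ch):
    """Return the operand count for an IDC, or 0 for any other character."""
    if ch in TERNARY_IDCS:
        return 3
    if ch in ALL_IDCS:
        return 2
    return 0


def is_well_formed_ids(text):
    """Check that text is exactly one prefix-notation description.

    Hypothetical helper: structural check only, not a full validator.
    """
    if not text:
        return False
    pending = 1  # components still expected
    for ch in text:
        if pending == 0:
            return False  # extra characters after a complete description
        # An IDC consumes one slot and opens `arity` new ones;
        # any other character simply fills one slot.
        pending += idc_arity(ch) - 1
    return pending == 0


if __name__ == "__main__":
    print(is_well_formed_ids("⿰氵工"))  # left-right description of 江
    print(is_well_formed_ids("⿰氵"))    # missing second operand
```

Note that a sequence like this only says how the pieces are arranged.
Nothing in it records *why* the character is written that way, which is
the part a TVS would carry.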
> Adding a taboo variant selector does make a difference, because then
> there'll be more than one way to reference the same character.
>
Well, yes and no. Even though we've already got taboo variants
encoded, we have no way to flag in a text that they are serving as
taboo variants. The interesting thing about a taboo variant is
precisely that meaning: this is character X written in a deliberately
distorted way. You identified the taboo variants you found in Ext B
not from anything in the standard, but from your outside knowledge.
A student encountering them in a text may well be stymied until she
goes to her professor.
Meanwhile, multiple encodings of the same Han character are *already* a
major problem. This is one reason why the UTC is determined to be
stricter in the future, so that it stops happening.
==========
John H. Jenkins
jenkins@apple.com
jhjenkins@mac.com
http://homepage.mac.com/jhjenkins/