Taboo Variants (was Re: Digraphs as Distinct Logical Units)

From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Fri Aug 09 2002 - 08:45:20 EDT


"James Kass" wrote:

>
> Proposal to Add IDEOGRAPHIC TABOO VARIATION INDICATOR
> to ISO/IEC 10646:
> http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2475.pdf

Thanks for the reference.

There seem to be a couple of problems with this proposal as far as I can see.

1. The Ideographic Taboo Variation Indicator is proposed for inclusion in the Kangxi Radicals block!!!

Surely they can't be serious. If they just need an empty code point, they might as well put it at
U+03A2 and be damned. Probably the CJK Symbols and Punctuation block would be more appropriate, but
that's full up now, which I guess is why it's proposed to put the character at any old empty code
point. The original CJK Symbols and Punctuation block was always going to be too small, and I
believe that a new block is needed for extended CJK Symbols and Punctuation (there are still a
number of ideographic symbols that need encoding, such as the two or three commonly encountered
symbols that have the same semantics as U+3005 IDEOGRAPHIC ITERATION MARK).
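The point about empty code points can be checked roughly with Python's standard unicodedata module. This is only a sketch: the block ranges are the published ones, the helper name unassigned() is made up for illustration, and the result reflects whatever Unicode version ships with the interpreter, which is newer than the 2002 state of the standard discussed here.

```python
import unicodedata

# Published block ranges (first and last code point, inclusive).
BLOCKS = {
    "Kangxi Radicals": (0x2F00, 0x2FDF),
    "CJK Symbols and Punctuation": (0x3000, 0x303F),
}


def unassigned(first, last):
    """Return the code points in [first, last] whose general category is Cn.

    Hypothetical helper for illustration; Cn means "not assigned" in the
    Unicode database bundled with this Python interpreter.
    """
    return [cp for cp in range(first, last + 1)
            if unicodedata.category(chr(cp)) == "Cn"]


for name, (first, last) in BLOCKS.items():
    free = unassigned(first, last)
    print(f"{name}: {len(free)} unassigned",
          [f"U+{cp:04X}" for cp in free])

# U+03A2 is the reserved gap in the Greek block mentioned above.
print("U+03A2 category:", unicodedata.category("\u03a2"))
```

The Kangxi Radicals block shows a handful of free positions at its end while CJK Symbols and Punctuation shows none, which would explain why the former was picked for the proposal.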

2. Looking at CJK Unified Ideographs Extension B, it seems that the most common taboo variants are
now already encoded in Unicode. In addition to U+2239E and U+248E5 which I have already mentioned,
the primary example of a taboo-form variant character given in the proposal is also encoded at
U+22606. The secondary examples (where the taboo-form is used as a phonetic component in a more
complex character) could be currently coded using Ideographic Description Characters - e.g. <U+2FF0,
U+2E98, U+22606> and <U+2FF0, U+2EAF, U+22606>. Is there still a need for an Ideographic Taboo
Variation Indicator?
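For anyone who wants to try the two Ideographic Description Sequences given above, here is a minimal Python sketch that builds them from the code points quoted in this message and prints whatever names the interpreter's Unicode database has for each element. The helper name ids() is made up for illustration, and whether the result displays as anything useful depends entirely on the fonts installed.

```python
import unicodedata


def ids(*code_points):
    """Build an Ideographic Description Sequence string from code points.

    Hypothetical helper for illustration; it simply concatenates the
    characters and does not validate the sequence.
    """
    return "".join(chr(cp) for cp in code_points)


# The two sequences quoted in the message above.
sequences = [
    ids(0x2FF0, 0x2E98, 0x22606),
    ids(0x2FF0, 0x2EAF, 0x22606),
]

for seq in sequences:
    # Show the sequence as U+XXXX notation, then each element's name.
    print(" ".join(f"U+{ord(ch):04X}" for ch in seq))
    for ch in seq:
        print("   ", unicodedata.name(ch, "<no name in this database>"))
```

Each sequence is three code points long: the left-to-right description character, a radical, and the Extension B taboo form as the right-hand component.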

Personally I still think that a separate CJK Taboo Replacement Characters block would have been more
logical ... but it's too late now.

By the way, when's Code2000 going to include the CJK Unified Ideographs Extension B glyphs? There
are actually a few useful characters hidden here and there amongst the morass of junk characters.

Andrew West


