From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Apr 14 2004 - 17:49:43 EDT
Gary asked:
> Thanks, Frank. I hadn't found the grid index and it looks
> worth remembering. But now I'm beginning to think I didn't
> ask the right question.
>
> Judging by what we saw in the back of the Unicode 2.0 book,
> we would tend to say that it is correct that (in an index)
> 21333 (0x5355) is sorting under 21313 (0x5341) instead of
> 20843 (0x516b).
That is an incorrect assumption. Dictionaries (and indexes to
character lists in general) make different assignments to
radicals in these marginal cases. There is no one right
answer for every circumstance. Certainly, you cannot take
the Unicode *2.0* radical/stroke index as definitive for
anything.
If you look at the Unicode *4.0* radical/stroke index, you will
find that U+5355 is listed *twice* in the index, once under
the U+5341 radical and once under the U+516B radical,
precisely to make it easier for people to find the
character in question, no matter what their assumption might
be regarding what part represents "the" radical for the
character. This situation is not uncommon, particularly
for simplified characters, which don't always have obvious
radicals. Note that the traditional form of this character,
U+55AE, is listed under the 'mouth' radical, U+53E3, and
not under U+5341 or U+516B.
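To make the double listing concrete, here is a minimal Python sketch of an index in which one character can be filed under more than one radical. It is an illustration only, not how the Unicode index is actually produced: the names `ENTRIES`, `build_index`, and `radicals_for` are hypothetical, and the residual stroke counts are assumptions rather than values copied from the published index. Only the code points come from the discussion above.

```python
from collections import defaultdict

# Illustrative entries: (character, radical it is filed under, residual strokes).
# The code points are the ones discussed in this message; the residual stroke
# counts are assumed for the sake of the example.
ENTRIES = [
    (0x5355, 0x5341, 6),  # simplified form, filed under U+5341
    (0x5355, 0x516B, 6),  # same character, also filed under U+516B
    (0x55AE, 0x53E3, 9),  # traditional form, filed under U+53E3 ('mouth')
]


def build_index(entries):
    """Group characters by radical, ordered by residual strokes, then code point."""
    index = defaultdict(list)
    for char, radical, strokes in entries:
        index[radical].append((strokes, char))
    for listing in index.values():
        listing.sort()
    return dict(index)


def radicals_for(index, char):
    """Return every radical under which the character is listed."""
    return sorted(
        radical
        for radical, listing in index.items()
        if any(c == char for _, c in listing)
    )


INDEX = build_index(ENTRIES)

for radical in radicals_for(INDEX, 0x5355):
    print(f"U+5355 is listed under U+{radical:04X}")
```

Because a lookup can return more than one radical, a reader who starts from either assumption about "the" radical still finds the character.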
> I am looking for some table of radicals
> that I can show our customer to help support that claim.
I think that rather than arguing with the customer on the
basis of an old radical/stroke index in Unicode 2.0, your
best course might simply be to provide the customer with
the behavior they desire. :-)
> Perhaps I should start by asking for opinions on the above
> sorting, and for guidelines on how best to govern such
> decisions,
Please your customer.
--Ken
> though I'll admit I know less than the development
> engineer involved, and so may be asking a less educated
> question than we would have.