From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Tue May 13 2003 - 12:23:15 EDT
On Tue, 13 May 2003 08:48:42 -0600, John Jenkins wrote:
> > Our radical/stroke sort relies on the fact that unicode order is the
> > same as radical/stroke order.
>
> Actually, this is not quite true. Outside of the fact that the Han
> ideographs are spread out over three blocks, there are ambiguities in
> stroke-counting which can result in disagreements.
That's certainly true, but sorting by Unicode code point will be roughly 90% OK
for the 99.99% of CJK data that is encoded within the basic CJK block (and at
the radical level it'll probably be 99.9% OK). As a rough-and-ready way of
sorting CJK data, it's definitely the most cost-effective to implement. As I
said, it all depends on what you want it for.
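For what it's worth, here is a minimal sketch of what "sorting by code point" amounts to, in Python. The helper name and the sample characters are my own illustrative assumptions, not anything taken from the sort discussed above.

```python
# Hypothetical helper: build a sort key from the Unicode code points of a string.
def codepoint_key(s):
    return [ord(ch) for ch in s]

# Four ideographs from the basic CJK block, deliberately out of order.
words = ["人", "一", "水", "口"]  # U+4EBA, U+4E00, U+6C34, U+53E3

# Within the basic block, code point order follows the radical order here.
sorted_words = sorted(words, key=codepoint_key)

# Caveat from the quoted text: ideographs live in more than one block.
# U+3400 (CJK Extension A) has a lower code point than U+4E00, so it sorts
# ahead of 一 even though 一 comes first in any radical/stroke ordering.
mixed = sorted(words + ["\u3400"], key=codepoint_key)
```

Python 3 already compares strings code point by code point, so a plain `sorted(words)` gives the same order; the explicit key just makes the comparison visible. Anything stricter (merging the blocks, resolving stroke-count disagreements) would need a real radical/stroke table rather than raw code points.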
Andrew