From: Dan Kogai (dankogai@dan.co.jp)
Date: Tue May 13 2003 - 15:16:52 EDT
On Tuesday, May 13, 2003, at 11:48 PM, John Jenkins wrote:
>> Stroke order, then, is something different. Seems like we would need
>> order entries in the config data for every character, which would be
>> totally unmanageable.
>>
>> I didn't have any luck searching the Unicode web site for information
>> about sorting by stroke.
>>
>
> There is a kTotalStrokes field in Unihan.txt, although it doesn't
> cover every character in Unihan. This would definitely be a good
> place to start.
If you are using Perl 5.6.0 or higher (5.8.0 recommended), you can use
the Unicode::Unihan module, available via CPAN. Let me show you a small
example.
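#!/usr/local/bin/perl
use strict;
use Unicode::Unihan;
my $uh = Unicode::Unihan->new;
my $str = "\x{5c0f}\x{98fc}\x{5f3e}"; # my name in Kanji
# Split the string into individual characters.
my @chars = map {chr($_)} unpack("U*" => $str);
# Called in list context here, giving one stroke count per character.
my @strokes = $uh->TotalStrokes($str);
# Map each character to its stroke count.
my %c2s; @c2s{@chars} = @strokes;
binmode STDOUT => ':utf8';
# Sort by stroke count, breaking ties by string comparison.
for my $char (sort {$c2s{$a} <=> $c2s{$b} || $a cmp $b} @chars){
    print "$char => $c2s{$char}\n";
}
__END__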
And here is what it prints.
小 => 3
弾 => 12
飼 => 14
I am not sure if Unicode::Unihan is robust enough for practical use,
but IMHO it is a handy place to start.
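One caveat, given John's note that kTotalStrokes does not cover every
character: a character with no entry would presumably come back as undef,
and a plain <=> comparison would then treat it as 0 and sort it first.
Below is a minimal sketch of one way to push such characters to the end
instead. The helper name sort_by_strokes is hypothetical (it is not part
of Unicode::Unihan), and the stroke table is hard-coded from the output
above so the sketch runs without the module. U+3042 (a hiragana, which is
not in Unihan) stands in for a character with no stroke count.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Unihan.
# Takes a hash ref (character => stroke count or undef) and a list of
# characters; returns the characters sorted by stroke count, with
# characters lacking a count placed last.
sub sort_by_strokes {
    my ($c2s, @chars) = @_;
    return sort {
        my ($x, $y) = ($c2s->{$a}, $c2s->{$b});
        !defined($x) && !defined($y) ? $a cmp $b   # both unknown
      : !defined($x)                 ? 1           # unknown after known
      : !defined($y)                 ? -1          # known before unknown
      : $x <=> $y || $a cmp $b;                    # both known
    } @chars;
}

# Stroke counts copied from the output above; undef marks a character
# assumed to have no kTotalStrokes entry.
my %c2s = (
    "\x{5c0f}" => 3,
    "\x{5f3e}" => 12,
    "\x{98fc}" => 14,
    "\x{3042}" => undef,
);

binmode STDOUT => ':utf8';
for my $char (sort_by_strokes(\%c2s, keys %c2s)) {
    my $n = defined $c2s{$char} ? $c2s{$char} : '(unknown)';
    print "$char => $n\n";
}
```

With the real module you would fill %c2s from $uh->TotalStrokes as in the
script above. Whether unknown characters belong at the end or somewhere
else is a policy choice, not something Unihan decides for you.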
Dan the Perl5 Porter