Kenneth Whistler wrote:
> Someday I'll write myself a little command line convertor for this --
> I spend way too much time hand converting these little examples
> back and forth!
Oh, very well, here it is:
---cut here---
#!/usr/bin/perl
# This silly script examines its first argument.
# It converts a U+xxxx or U-xxxxxxxx string into UTF-8 (printed as hex bytes).
# If the argument doesn't look like that, it's assumed to be a UTF-8
# sequence already, written as hex digits with no spaces (e.g. E282AC),
# and is converted to UTF-16 and, for four-byte sequences, UTF-32 instead.
# No significant error checking; do not use in production.
#
# John Cowan (cowan@ccil.org) wrote this because Ken Whistler and I got
# tired of doing the job by hand all the time.
# No copyright, no warranty, use as you will.
unless (($_) = @ARGV) {
    die "usage: utf (U+xxxx | U-xxxxxxxx | xxxx...)\n";
}
if (/^U\+(....)$/) {
    # BMP code point: one, two, or three bytes of UTF-8.
    $v = hex($1);
    if ($v < 0x80) {
        printf "%-2.2X\n", $v;
    }
    elsif ($v < 0x800) {
        # Two bytes cover U+0080 through U+07FF inclusive.
        $lead = 0xc0 + (($v >> 6) & 0x1f);
        $t1 = 0x80 + ($v & 0x3f);
        printf "%-2.2X %-2.2X\n", $lead, $t1;
    }
    else {
        $lead = 0xe0 + (($v >> 12) & 0xf);
        $t1 = 0x80 + (($v >> 6) & 0x3f);
        $t2 = 0x80 + ($v & 0x3f);
        printf "%-2.2X %-2.2X %-2.2X\n", $lead, $t1, $t2;
    }
}
elsif (/^U-(........)$/) {
    # Always emits four bytes, so the value is assumed to be at least
    # 0x10000; smaller values would come out as an overlong form.
    $v = hex($1);
    $lead = 0xf0 + (($v >> 18) & 0x7);
    $t1 = 0x80 + (($v >> 12) & 0x3f);
    $t2 = 0x80 + (($v >> 6) & 0x3f);
    $t3 = 0x80 + ($v & 0x3f);
    printf "%-2.2X %-2.2X %-2.2X %-2.2X\n", $lead, $t1, $t2, $t3;
}
else {
    # Otherwise treat the argument as 1 to 4 UTF-8 bytes in hex.
    if (/^(..)$/) {
        $lead = hex($1);
        printf "U+%-4.4X\n", $lead;
    }
    elsif (/^(..)(..)$/) {
        $lead = hex($1);
        $t1 = hex($2);
        printf "U+%-4.4X\n", (($lead & 0x1f) << 6) + ($t1 & 0x3f);
    }
    elsif (/^(..)(..)(..)$/) {
        $lead = hex($1);
        $t1 = hex($2);
        $t2 = hex($3);
        printf "U+%-4.4X\n", (($lead & 0xf) << 12) +
            (($t1 & 0x3f) << 6) + ($t2 & 0x3f);
    }
    elsif (/^(..)(..)(..)(..)$/) {
        $lead = hex($1);
        $t1 = hex($2);
        $t2 = hex($3);
        $t3 = hex($4);
        # A four-byte lead (11110xxx) carries three payload bits.
        $v = (($lead & 0x7) << 18) + (($t1 & 0x3f) << 12) +
            (($t2 & 0x3f) << 6) + ($t3 & 0x3f);
        # UTF-16 surrogate pair, then the UTF-32 value.
        $s1 = 0xd800 + ((($v - 0x10000) >> 10) & 0x3ff);
        $s2 = 0xdc00 + ($v & 0x3ff);
        printf "U+%-4.4X U+%-4.4X\n", $s1, $s2;
        printf "U-%-8.8X\n", $v;
    }
    else {
        die "eh?\n";
    }
}
---cut here---
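For anyone who wants to sanity-check the bit arithmetic, here is a minimal cross-check sketch. It is not part of the script above: the helper names utf8_hex and surrogate_hex are made up for illustration, and it assumes a Perl recent enough to provide the built-in utf8::encode (5.8 or later). It lets Perl's own encoder produce the UTF-8 bytes and recomputes the surrogate pair with the usual subtract-0x10000-and-split formula.

```perl
#!/usr/bin/perl
# Cross-check sketch; utf8_hex and surrogate_hex are hypothetical helper
# names, not part of the original script.
use strict;
use warnings;

# Encode one code point with Perl's built-in encoder and return the
# resulting bytes as space-separated hex, in the script's output style.
sub utf8_hex {
    my ($cp) = @_;
    my $s = chr($cp);      # one-character string holding the code point
    utf8::encode($s);      # in place: characters become UTF-8 octets
    return join ' ', map { sprintf '%02X', $_ } unpack('C*', $s);
}

# Split a supplementary code point (U+10000 and above) into its UTF-16
# surrogate pair: subtract 0x10000, then split into two 10-bit halves.
sub surrogate_hex {
    my ($cp) = @_;
    my $off = $cp - 0x10000;
    return sprintf 'U+%04X U+%04X',
        0xD800 + ($off >> 10), 0xDC00 + ($off & 0x3FF);
}

print utf8_hex(0x41), "\n";          # 41
print utf8_hex(0xE9), "\n";          # C3 A9
print utf8_hex(0x20AC), "\n";        # E2 82 AC
print utf8_hex(0x1F600), "\n";       # F0 9F 98 80
print surrogate_hex(0x1F600), "\n";  # U+D83D U+DE00
```

The printed lines can be compared with what the script gives for U+0041, U+00E9, U+20AC and U-0001F600, and with the surrogate pair it prints for F09F9880.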
--
Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)