From: Dan Kogai (dankogai@dan.co.jp)
Date: Tue Mar 04 2003 - 14:25:08 EST
On Tuesday, Mar 4, 2003, at 07:59 Asia/Tokyo, David Oftedal wrote:
> Hello!
>
> Sorry to make this a mass spam, but I need a program to convert UTF-8
> to hex sequences. This is useful for embedding text in non-UTF web
> pages, but also for creating a Yudit keymap file, which I'm doing at
> the moment.
>
> For example, a file with the content æøå would yield the output
> "0x00E6 0X00F8 0X00E5", and the Japanese expression あの人 would yield
> "0x3042 0x306E 0x4EBA".
>
> Can anyone tell me how to do it without making a program for it
> myself? It would be VERY helpful, and I've already made 2 programs for
> assembling this file and I'm not starting on another just yet.
Perl 5.8 lets you do this in a one-liner:
perl -MEncode -ple '$_=join(" ",map {sprintf "0x%04X", $_} unpack("U*", decode("utf8",$_)))'
A more descriptive script follows:
#
use strict;
use Encode;
while(<>){
    chomp $_;
    my $line = decode("utf8" => $_);                  # UTF-8 bytes -> characters
    my (@chars) = unpack("U*" => $line);              # characters -> code points
    my (@hexed) = map {sprintf "0x%04X", $_} @chars;  # code points -> "0xXXXX"
    my $hexed = join(" " => @hexed);
    print $hexed, "\n";
}
__END__
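For reuse, the same three steps (decode, unpack, sprintf) can be wrapped in a subroutine. This is only a minimal sketch: the name utf8_to_hex is hypothetical (it is not part of Encode), and it assumes its argument is a string of raw UTF-8 bytes, not an already decoded string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of Encode): takes raw UTF-8 bytes and
# returns the code points as space-separated "0xXXXX" strings.
sub utf8_to_hex {
    my ($bytes) = @_;
    my $chars = decode("utf8", $bytes);    # UTF-8 bytes -> characters
    return join " ",
        map { sprintf "0x%04X", $_ }       # code point -> hex string
        unpack("U*", $chars);              # characters -> code points
}

# The bytes below are the UTF-8 encoding of the first example, æøå.
print utf8_to_hex("\xC3\xA6\xC3\xB8\xC3\xA5"), "\n";   # prints 0x00E6 0x00F8 0x00E5
```

Note that %04X only sets a minimum width, so code points above U+FFFF come out with five or six hex digits instead of four.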
An even funkier example, which defines the conversion as a custom encoding and applies it as an output layer:
#
package Encode::Hex;
use strict;
use base qw(Encode::Encoding);
__PACKAGE__->Define('hex');
sub encode($$;$){
    my ($obj, $str, $chk) = @_;
    my @hexed =
        map {$_ == ord("\n") ? chr($_) : sprintf "0x%04X", $_}
        unpack("U*" => $str);
    $_[1] = '' if $chk;
    return join(" " => @hexed);
}
package main;
binmode STDIN => ":utf8";
binmode STDOUT => ":encoding(hex)";
while(<>){
    chomp;
    print $_, "\n";
}
__END__
Dan the (Perl5 Porter|Encode Maintainer)