I'll answer this one.
On 2002.02.02, at 03:28, Yves Arrouye wrote:
> That is understandable if they use different tables. The question is
> which one is the "right" EUC-JP, and which one do users want? ICU, as
> well as iconv, could have two tables with the different mappings. The
> question then is how to label them, and whether the labeling should be
> compatible between the two.
I don't know which one is 'right', but the most practical and widely used
euc-jp is as follows:
\x00 - \x7f Maps to US-ASCII
\xa1a1 - \xfefe Maps to JISX-0208 (aka Zenkaku)
\x8ea1 - \x8edf Maps to JISX-0201 katakana (aka Hankaku)
In addition, the extended form of euc-jp also includes:
\x8fa1a1 - \x8ffefe Maps to JISX-0212
That's what iconv, Tcl's *.enc, and my humble Jcode think euc-jp is.
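To make the ranges above concrete, here is a minimal Python sketch that walks a byte string and labels each character by the set it falls in. It is only an illustration of the layout listed above, not how iconv, Tcl, or Jcode actually implement it; the name classify_euc_jp and the label strings are made up for this example.

```python
from typing import List, Tuple

SS2 = 0x8E  # lead byte that introduces JISX-0201 (Hankaku)
SS3 = 0x8F  # lead byte that introduces JISX-0212


def _in_a1_fe(byte: int) -> bool:
    """True if the byte lies in the \\xa1 - \\xfe range."""
    return 0xA1 <= byte <= 0xFE


def classify_euc_jp(data: bytes) -> List[Tuple[str, bytes]]:
    """Split data into (label, raw bytes) pairs, one per character.

    Hypothetical helper: it only checks the byte ranges listed in the
    post and does not map anything to Unicode.
    """
    result = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            label, width = "US-ASCII", 1
        elif lead == SS2:
            label, width = "JISX-0201", 2
        elif lead == SS3:
            label, width = "JISX-0212", 3
        elif _in_a1_fe(lead):
            label, width = "JISX-0208", 2
        else:
            raise ValueError("unexpected lead byte 0x%02x at offset %d" % (lead, i))

        chunk = data[i:i + width]
        if len(chunk) != width:
            raise ValueError("truncated sequence at offset %d" % i)

        trail = chunk[1:]
        if lead == SS2:
            # Hankaku trail bytes stop at \xdf, not \xfe.
            ok = 0xA1 <= trail[0] <= 0xDF
        else:
            # Empty for US-ASCII, so all() is trivially true there.
            ok = all(_in_a1_fe(t) for t in trail)
        if not ok:
            raise ValueError("bad trail byte at offset %d" % i)

        result.append((label, chunk))
        i += width
    return result
```

For example, classify_euc_jp(b"A\xa4\xa2\x8e\xb1") yields one US-ASCII, one JISX-0208, and one JISX-0201 entry, while a lone b"\xa4" raises ValueError because the second byte is missing.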
> I find the same statement confusing. Are you saying that uconv's UTF-8
> is ill-formed? Nick, Would you mind email me (and just me, not the
> list) your table.euc sample file?
Get Jcode.pm via http://search.cpan.org/search?dist=Jcode and look
under the t/ directory. You will find table.euc and x0212.euc there.
Dan