Re: Japanese EUC and Shift-JIS text samples?

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Mon Oct 04 1999 - 17:00:33 EDT


> HTML could also be treated as plain text from the converter's point
> of view, right?
>
In a way, but it has a lot of ASCII characters that would not normally
be found in Japanese text. Plain Japanese text without markup would be
better. Maybe some kind of "full text" archive of literature such as
we have in the USA at university libraries?

How about newsgroup archives? (I think JIS-7 is used for newsgroups?
Or ISO-2022-JP?)

> http://home.netscape.com/ja for Shift_JIS
> http://www.yahoo.co.jp/ for EUC-JP
>
Either my Shift-JIS parser is wrong, or none of these web pages has
any halfwidth Katakana. But since I have only a USA Windows-95 PC
with Netscape for viewing, all I see is little boxes anyway, so I have
no way of knowing what I'm looking at (even if I *could* read
Japanese :-)
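
For what it's worth, the halfwidth Katakana check can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions, not the parser mentioned above: the function names are
hypothetical, the input is assumed to be well-formed, and Shift-JIS
lead bytes are taken to be 0x81-0x9F and 0xE0-0xFC. In Shift-JIS a
halfwidth Katakana is a single byte in 0xA1-0xDF; in EUC-JP it is the
byte 0x8E followed by a byte in 0xA1-0xDF.

```python
def count_halfwidth_katakana_sjis(data: bytes) -> int:
    """Count single-byte halfwidth Katakana (0xA1-0xDF) in Shift-JIS bytes."""
    count = 0
    i = 0
    while i < len(data):
        b = data[i]
        if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC:
            # Lead byte of a double-byte character: skip it and its trail
            # byte, which may itself fall in the 0xA1-0xDF range.
            i += 2
        else:
            if 0xA1 <= b <= 0xDF:
                count += 1
            i += 1
    return count


def count_halfwidth_katakana_eucjp(data: bytes) -> int:
    """Count halfwidth Katakana (0x8E followed by 0xA1-0xDF) in EUC-JP bytes."""
    count = 0
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x8E:
            # SS2 prefix: the next byte is a halfwidth Katakana code.
            if i + 1 < len(data) and 0xA1 <= data[i + 1] <= 0xDF:
                count += 1
            i += 2
        elif b == 0x8F:
            # SS3 prefix: a three-byte JIS X 0212 character.
            i += 3
        elif 0xA1 <= b <= 0xFE:
            # First byte of a two-byte JIS X 0208 character.
            i += 2
        else:
            # ASCII or anything else: one byte.
            i += 1
    return count
```

If both functions return 0 for a page, the page most likely has no
halfwidth Katakana, which is at least a way to check without being
able to display the text.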

Thanks!

- Frank


