Re: where can I find some unicode

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Mon Jul 26 1999 - 15:07:24 EDT


On 1999-07-25 at 22:27, Viranga Ratnaike wrote:
> Is there any freely available data encoded as either UTF-8 or UTF-16?

<http://titus.uni-frankfurt.de/unicode/unitest.htm#samples>
<http://pantheon.yale.edu/~jshin/faq/utf8_kr.html>

On 1999-07-25 at 22:27, Viranga Ratnaike wrote:
> But this is inconvenient as it's embedded in html.

You can try to get rid of the HTML tags in two ways:

- Via cut-and-paste: mark the text in your browser, copy it, paste
  it into your word processor, then store it as UTF-8 or UTF-16
  plain text. I have tried this with Netscape Communicator 4.05
  and MS-Word 97, and it essentially works.

- Store a copy of the HTML file and open it in an HTML-aware word
  processor, exploiting its HTML input conversion; then store it
  as a Unicode plain text file. I have tried this with Netscape
  Communicator 4.05 and MS-Word 97, and it essentially works.

In my first test, the original line-breaks from the HTML page were kept
in the plain text file. In my second test, the original paragraph-ends were
kept as line-breaks in the plain text, and a particular tag (viz. DIV) was
not removed. Your mileage may vary, depending on the software versions
used.
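
As a scripted alternative to the two manual routes above, here is a
minimal sketch in Python using only its standard library. It is an
illustration, not something tested against the sample pages linked
above. The names TextExtractor, html_to_text and save_plain_text are
made up for this example, the set of tags treated as paragraph breaks
is an arbitrary choice, and the HTML is assumed to be already decoded
into a string.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of an HTML page, dropping all tags (hypothetical helper)."""

    SKIP = {"script", "style"}                      # contents are discarded
    BLOCK = {"p", "div", "li", "tr", "h1", "h2", "h3", "pre"}  # assumed break tags

    def __init__(self):
        super().__init__(convert_charrefs=True)     # turns &amp; etc. into characters
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")                 # paragraph end becomes a line break

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the plain text of an HTML string, with tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def save_plain_text(text, path, encoding="utf-8"):
    """Write text as plain text; pass encoding="utf-16" for UTF-16 with a BOM."""
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text)
```

For example, save_plain_text(html_to_text(page), "sample.txt", "utf-16")
would store the stripped text as UTF-16, where page and sample.txt are
placeholders. As with the word processor routes, how line breaks come
out depends on the markup of the page.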

Best wishes,
   Otto Stolz


