Re: UTF16 <=> Reuters format?

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Wed Sep 30 1998 - 16:21:49 EDT


At 09:43 AM 9/30/98 -0700, Roman wrote:

>I think that initialDynamicOffset[1] in SCSU.java has to be changed
>from 0x0100, // Latin Extended A
>into 0x00C0, // combined partial Latin-1/-A
>to align it with http://www.unicode.org/unicode/reports/tr6.html
>but I haven't heard the final word on this.

Thanks for sending this again. You raised it while I was away over
the summer, but I could not locate the details in my inbox.
I'll look into it and make the appropriate changes.
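
For reference, here is a minimal sketch of the initial dynamic window
offsets as I read them in TR6, with window 1 at 0x00C0. The class name,
the array name, and the decode helper are illustrative assumptions and
are not copied from SCSU.java.

```java
// Sketch only: initial dynamic window offsets per Unicode TR6.
// Names here are hypothetical, not the identifiers used in SCSU.java.
final class ScsuInitialWindows {
    static final int[] INITIAL_DYNAMIC_OFFSET = {
        0x0080, // Latin-1 Supplement
        0x00C0, // combined partial Latin-1 Supplement / Latin Extended-A
        0x0400, // Cyrillic
        0x0600, // Arabic
        0x0900, // Devanagari
        0x3040, // Hiragana
        0x30A0, // Katakana
        0xFF00  // Fullwidth ASCII
    };

    // Map a single-byte-mode byte (0x80..0xFF) to a code point through
    // the given dynamic window, using the initial offsets above.
    static int decode(int window, int b) {
        if (b < 0x80 || b > 0xFF) {
            throw new IllegalArgumentException("byte must be in 0x80..0xFF");
        }
        return INITIAL_DYNAMIC_OFFSET[window] + (b - 0x80);
    }
}
```

With this table, byte 0x80 in window 1 decodes to U+00C0 rather than
U+0100, which is the practical effect of the change Roman describes.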

>The SCSU/*.java user interface is a bit object-oriented:

The driver program is indeed a bit barebones.

>You could easily change its putwchar function to output UTF-16 instead
>of UTF-8, see http://czyborra.com/utf/#UTF-16

This link fails for me.
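
Since the page is unreachable, here is a rough sketch of what a UTF-16
output routine could look like. It is not the putwchar from the driver;
the class and method names are hypothetical, and big-endian byte order
is an assumption.

```java
import java.io.ByteArrayOutputStream;

// Sketch only: writes code points as big-endian UTF-16.
// Utf16Out and putUtf16 are hypothetical names, not part of SCSU.java.
final class Utf16Out {
    // Append one Unicode code point to 'out' as UTF-16BE.
    static void putUtf16(ByteArrayOutputStream out, int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("not a Unicode scalar value");
        }
        if (cp < 0x10000) {
            writeUnit(out, cp);                  // one 16-bit code unit
        } else {
            int v = cp - 0x10000;                // 20 bits split 10/10
            writeUnit(out, 0xD800 | (v >> 10));  // high surrogate
            writeUnit(out, 0xDC00 | (v & 0x3FF)); // low surrogate
        }
    }

    // Write one 16-bit code unit, most significant byte first.
    private static void writeUnit(ByteArrayOutputStream out, int unit) {
        out.write(unit >> 8);
        out.write(unit & 0xFF);
    }
}
```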

>I did not yet program a compressor to SCSU because it is probably an
>exercise in combinatorial optimization to do that well and I currently
>find other questions more important.

I wish someone could share another implementation of a compressor,
even a non-optimal one. For testing the inflator, the compressor only
needs to use every feature of SCSU at some point.
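
To illustrate how small a non-optimal compressor can be, here is a
sketch that stays in single-byte mode and never changes windows. It
passes ASCII and Latin-1 through (relying on the initial window 0 at
0x0080) and quotes everything else with SQU. The class and method names
are hypothetical. It works on UTF-16 code units, so a surrogate pair
comes out as two quoted units; I am assuming that is acceptable, and it
should be checked against TR6. Because it uses only one tag, it would
not be enough by itself to test the inflator.

```java
import java.io.ByteArrayOutputStream;

// Sketch only: a deliberately naive SCSU encoder (hypothetical names).
final class NaiveScsu {
    static final int SQU = 0x0E; // "quote Unicode" tag, single-byte mode

    static byte[] compress(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // NUL, TAB, LF, CR and 0x20..0x7F are literal in single-byte mode.
            boolean passThrough = c == 0x00 || c == 0x09 || c == 0x0A
                    || c == 0x0D || (c >= 0x20 && c <= 0x7F);
            // 0x80..0xFF map onto the initial active window 0 (offset 0x0080).
            boolean inWindow0 = c >= 0x80 && c <= 0xFF;
            if (passThrough || inWindow0) {
                out.write(c);
            } else {
                // Anything else, including the other C0 controls, is
                // quoted as a 16-bit unit: tag, high byte, low byte.
                out.write(SQU);
                out.write(c >> 8);
                out.write(c & 0xFF);
            }
        }
        return out.toByteArray();
    }
}
```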


