From: Elliotte Rusty Harold (elharo@metalab.unc.edu)
Date: Tue Jan 20 2004 - 14:45:12 EST
At 9:52 AM -0800 1/20/04, Markus Scherer wrote:
>You need not invent something new: Just use a simplified SCSU
>encoder, and either a regular SCSU decoder or one that only supports
>the features which your custom encoder uses.
Thanks. It looks like exactly what I need.
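For illustration, a stripped-down encoder of the kind Markus describes might look roughly like the sketch below. It is not his 75-line C encoder; it is a much cruder variant that never leaves SCSU's initial state (single-byte mode, dynamic window 0 at its default offset 0x0080) and quotes everything else one UTF-16 code unit at a time with SQU. The class name TinyScsuSketch is made up for the example. Quoting surrogates one unit at a time is assumed here to be acceptable to a regular decoder; that should be checked against UTS #6 before relying on it.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical minimal SCSU encoder: stays in the initial single-byte mode
// and never redefines or switches windows.
final class TinyScsuSketch {
    private static final int SQ0 = 0x01; // quote one char from static window 0 (U+0000..U+007F)
    private static final int SQU = 0x0E; // quote one UTF-16 code unit, high byte first

    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if ((c >= 0x20 && c <= 0x7F) || c == 0x00 || c == 0x09 || c == 0x0A || c == 0x0D) {
                // Printable ASCII, DEL, NUL, TAB, LF and CR pass through unchanged.
                out.write(c);
            } else if (c < 0x20) {
                // Other C0 controls collide with SCSU tag bytes, so quote them.
                out.write(SQ0);
                out.write(c);
            } else if (c <= 0xFF) {
                // U+0080..U+00FF map onto bytes 0x80..0xFF through the
                // initially active dynamic window 0 (default offset 0x0080).
                out.write(c);
            } else {
                // Everything else: quote the UTF-16 code unit (3 bytes).
                // Assumption: a surrogate pair becomes two quoted units.
                out.write(SQU);
                out.write(c >> 8);
                out.write(c & 0xFF);
            }
        }
        return out.toByteArray();
    }
}
```

With this sketch, Latin-1 text costs one byte per character and anything else costs three, so it only pays off for mostly-Latin-1 data. A real encoder would switch windows or modes instead of quoting every character.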
>For a tiny SCSU encoder (main function 75 lines of commented C) that
>also compresses a little better than what you describe see
>http://www.mindspring.com/~markus.scherer/unicode/tr6/
>
>You could scale that encoder up or down to your liking.
>
>For a full SCSU converter you could use ICU, for example.
>http://oss.software.ibm.com/icu/
Hmm, I'm already carrying around part of ICU4J to perform
normalization. I'll have to check and see if I've got the SCSU
support compiled into my version of the ICU jar.
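One quick way to make that check is to ask the class loader whether the SCSU classes are on the classpath. The sketch below assumes the relevant ICU4J class is com.ibm.icu.text.UnicodeCompressor (with com.ibm.icu.text.UnicodeDecompressor as its counterpart); the helper class ScsuSupportCheck and its method name are invented for the example.

```java
// Hypothetical helper: reports whether a class can be found on the classpath
// without initializing it.
final class ScsuSupportCheck {
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, ScsuSupportCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not in the jar, or present but unloadable in this build.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class names for ICU4J's SCSU support.
        String[] names = {
            "com.ibm.icu.text.UnicodeCompressor",
            "com.ibm.icu.text.UnicodeDecompressor"
        };
        for (String name : names) {
            System.out.println(name + ": " + (isClassPresent(name) ? "present" : "missing"));
        }
    }
}
```

Run it with the trimmed ICU jar on the classpath. A "missing" result only means the class was not found under that name, not that the jar lacks SCSU support in some other form.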
>You could also use BOCU-1.
Reading the BOCU tech note, it looks like SCSU performs better. The
main benefit of BOCU is if you're transmitting this encoding on the
wire, which I am definitely not doing. But SCSU looks like a really
nice option. Thanks.
--
Elliotte Rusty Harold
elharo@metalab.unc.edu
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA