Asmus Freytag <asmusf@ix.netcom.com>:
AF> You apparently don't seem to realize that SCSU bridges the gap between
AF> an 8-bit based LZW and a 16-bit encoded Unicode text, by removing the
AF> extra redundancy that is part of the encoding (sequences of every other
AF> byte being null) and not a redundancy in the content. The output of SCSU
AF> should be sent to LZW for block compression where that's desired.
After re-reading the SCSU (sorry for the typo) definition, I realise
that the use of SCSU as a preprocessor for LZW or arithmetic coding does
indeed make sense. Contrary to what I said earlier, I have convinced
myself that using SCSU in this manner might be a significant win
for some scripts.
This fully answers my question about the rationale for SCSU.
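A rough way to see the effect is to count the codes a textbook byte-oriented LZW encoder emits for the same text stored as UTF-16 and stored one byte per character. This is only a sketch under assumptions: `lzw_code_count` is a hypothetical helper with an unbounded dictionary, and Latin-1 bytes stand in for SCSU output (for plain Latin-1 text the two should, as far as I understand the definition, be essentially the same).

```python
def lzw_code_count(data: bytes) -> int:
    """Count the codes a textbook LZW encoder over bytes would emit."""
    table = {bytes([i]) for i in range(256)}  # initial single-byte entries
    current = b""
    codes = 0
    for b in data:
        candidate = current + bytes([b])
        if candidate in table:
            current = candidate          # keep extending the match
        else:
            codes += 1                   # emit the code for `current`
            table.add(candidate)         # learn the new, longer phrase
            current = bytes([b])
    if current:
        codes += 1                       # flush the final phrase
    return codes


text = ("Die Größe der Bücher überrascht die müden Schüler, "
        "während draußen der Frühling blüht.")

utf16 = text.encode("utf-16-be")   # every other byte is null for this text
latin1 = text.encode("latin-1")    # stand-in for SCSU output (assumption)

print("UTF-16 bytes:", len(utf16), "LZW codes:", lzw_code_count(utf16))
print("1-byte bytes:", len(latin1), "LZW codes:", lzw_code_count(latin1))
```

The null bytes carry no information, yet an 8-bit LZW has to spend dictionary entries learning them before it can get at the repetition in the text itself.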
AF> Another design point of SCSU is that it is editable (you can
AF> replace a piece in the middle, w/o having to change the stuff at
AF> the beginning or the end.)
Agreed, although you need to carefully keep track of the state of the
encoder when you do this.
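To illustrate what "state" means here, the sketch below decodes a toy subset of SCSU single-byte mode: only the SCn window-selection tags and the default dynamic windows. The tag values and window offsets are as I read them in the SCSU definition and should be checked against it; `decode_subset` and its `active` parameter are hypothetical names, not part of any library.

```python
# Default dynamic window offsets (my reading of the SCSU definition).
DEFAULT_WINDOWS = [0x0080, 0x00C0, 0x0400, 0x0600,
                   0x0900, 0x3040, 0x30A0, 0xFF00]
SC0 = 0x10  # SC0..SC7 (0x10..0x17) select dynamic window 0..7


def decode_subset(data: bytes, active: int = 0) -> str:
    """Decode a toy subset of SCSU, starting with window `active` selected."""
    out = []
    for b in data:
        if SC0 <= b <= SC0 + 7:
            active = b - SC0                      # change the active window
        elif b >= 0x80:
            out.append(chr(DEFAULT_WINDOWS[active] + (b - 0x80)))
        elif b >= 0x20 or b in (0x09, 0x0A, 0x0D):
            out.append(chr(b))                    # ASCII passes through
        else:
            raise ValueError(f"tag {b:#04x} is outside this toy subset")
    return "".join(out)


# SC2 selects the window at U+0400, then four Cyrillic letters follow.
encoded = bytes([0x12, 0xC2, 0xB5, 0xC1, 0xC2])

print(decode_subset(encoded))                # decoded from the start
print(decode_subset(encoded[1:]))            # same bytes, tag not seen
print(decode_subset(encoded[1:], active=2))  # state supplied by the caller
```

The same four bytes come out as Cyrillic or as Latin-1 depending on which window is active, so an editor replacing a piece in the middle has to know the state at both ends of the replaced span.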
AF> Another factor is probably (I didn't check this) the different
AF> ability to do semi random accesses into the middle of compressed
AF> text.
I don't think this is important. If you encode each character name
separately, you get as much random access ability as you might
reasonably need. Most character names are a dozen or so characters
long, and decoding all of one name in order to access a substring is
a most reasonable approach.
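A minimal sketch of that layout, assuming an offset table next to a blob of independently encoded names. zlib is only a placeholder codec here (on strings this short it will probably make them larger), and `name_at` and `substring` are hypothetical helpers.

```python
import zlib
from itertools import accumulate

names = [
    "LATIN SMALL LETTER A",
    "GREEK CAPITAL LETTER OMEGA",
    "CYRILLIC SMALL LETTER TE",
]

# Each name is encoded on its own, so no record depends on its neighbours.
records = [zlib.compress(n.encode("ascii")) for n in names]
offsets = [0, *accumulate(len(r) for r in records)]  # record i is blob[offsets[i]:offsets[i+1]]
blob = b"".join(records)


def name_at(i: int) -> str:
    """Decode the i-th name without touching any other record."""
    return zlib.decompress(blob[offsets[i]:offsets[i + 1]]).decode("ascii")


def substring(i: int, start: int, stop: int) -> str:
    """Decode the whole name, then slice; cheap for names this short."""
    return name_at(i)[start:stop]


print(name_at(2))
print(substring(1, 6, 13))
```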
AF> PS:) coders will be coders, they like to invent new coding schemes
He nodded, wiping a tear with a distraught gesture.
Thanks for your answers,
J.