On Oct 21, 3:51am, unicode@Unicode.ORG wrote:
> Subject: Re: Unicode-capable compression software
> From: NAME: Misha Wolf
>
> I had assumed that traditional compression algorithms looked for repeats
> on an 8-bit basis and, hence, would fail to compress Unicode. Is this
> assumption correct/incorrect?
>
> John - Please do send me your paper.
>
> Many thanks.
>-- End of excerpt from unicode@Unicode.ORG
Traditional compression algorithms do work on an 8-bit basis, but looking at
Unicode text as a sequence of bytes will still find a lot of patterns. It just
doesn't do as good a job as it would if it dealt with 16-bit chunks.
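As a rough illustration of that point, here is a minimal sketch using Python's standard zlib module, which is a byte-oriented compressor. The sample text and variable names are my own assumptions, chosen only for illustration; this is not the method from my paper.

```python
import zlib

# Hypothetical sample text (an assumption for illustration only):
# repeated Greek and English phrases, all within the Basic Multilingual Plane.
text = "Καλημέρα κόσμε. Unicode text repeats at the byte level too. " * 40

# Encode as 16-bit code units (big-endian, no byte order mark).
utf16 = text.encode("utf-16-be")

# zlib (DEFLATE) looks for repeated byte sequences; it knows nothing
# about 16-bit characters.
packed = zlib.compress(utf16, 9)

print("UTF-16 bytes:    ", len(utf16))
print("compressed bytes:", len(packed))

# The round trip restores the original byte sequence exactly.
restored = zlib.decompress(packed)
```

The compressed size comes out well below the raw UTF-16 size, because repeated characters and words are also repeated byte sequences. The sketch says nothing about how much better a compressor working on 16-bit chunks would do.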
I am sorry I can't send you my paper, since I no longer have a copy. I left it
at my last job (where I wrote it). If you contact Steve Greenfield at the
Unicode office ((408) 777-5870 or unicode-inc@unicode.org), he can get you
copies of the proceedings for the UIWs.
John Bennett