From: Doug Ewell (dewell@adelphia.net)
Date: Tue Nov 25 2003 - 19:38:03 EST
Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
> So SCSU and BOCU-* formats are NOT general-purpose compressors. As
> they are defined only in terms of streams of Unicode code points, they
> are assumed to follow the conformance clauses of Unicode. As they
> recognize their input as Unicode text, they can recognize canonical
> equivalence, and thus this creates an opportunity for them to consider
> if a (de)normalization or de/re-composition would result in higher
> compression (interestingly, the composition exclusion could be
> reconsidered in the case of BOCU-1 and SCSU compressed streams,
> provided that the decompression to code points will redecompose the
> excluded compositions).
I have to say, if there's a flaw in Philippe's logic here, I don't see
it. Anyone?
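
For anyone who wants to poke at the idea, here is a minimal Python sketch
of what "pick whichever canonically equivalent form is smaller" might look
like. It is not an SCSU or BOCU-1 encoder: the code point count is only a
rough stand-in for compressed size, and the function names
canonically_equivalent and shortest_equivalent_form are made up for
illustration. The only library call used is unicodedata.normalize from the
Python standard library.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def shortest_equivalent_form(text: str) -> str:
    """Return the shortest of: the input, its NFC form, its NFD form.

    Assumption: fewer code points is used as a proxy for "compresses
    better". A real SCSU/BOCU-1 encoder would compare encoded byte counts.
    """
    candidates = [
        text,
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFD", text),
    ]
    return min(candidates, key=len)


# Decomposed input: each "e" is followed by U+0301 COMBINING ACUTE ACCENT.
sample = "re\u0301sume\u0301"
packed = shortest_equivalent_form(sample)

# Composition exclusion case: U+0958 DEVANAGARI LETTER QA is excluded from
# composition, so NFC leaves it as the two-code-point sequence U+0915 U+093C.
qa_precomposed = "\u0958"
qa_nfc = unicodedata.normalize("NFC", qa_precomposed)

print(len(sample), len(packed), canonically_equivalent(sample, packed))
print(len(qa_precomposed), len(qa_nfc),
      canonically_equivalent(qa_precomposed, qa_nfc))
```

The second case is the one Philippe's parenthetical is about: the excluded
precomposed character is shorter than anything normalization will produce,
so a compressor could only exploit it if the decompressor redecomposes on
output. Note also that the receiver gets back a canonically equivalent
string, not necessarily the same code point sequence that went in.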
-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/