From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Nov 26 2003 - 07:09:57 EST
On 25/11/2003 16:38, Doug Ewell wrote:
>Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
>
>>So SCSU and BOCU-* formats are NOT general purpose compressors. As
>>they are defined only in terms of stream of Unicode code points, they
>>are assumed to follow the conformance clauses of Unicode. As they
>>recognize their input as Unicode text, they can recognize canonical
>>equivalence, and thus this creates an opportunity for them to consider
>>if a (de)normalization or de/re-composition would result in higher
>>compression (interestingly, the composition exclusion could be
>>reconsidered in the case of BOCU-1 and SCSU compressed streams,
>>provided that the decompression to code points will redecompose the
>>excluded compositions).
>
>I have to say, if there's a flaw in Philippe's logic here, I don't see
>it. Anyone?
>
>-Doug Ewell
> Fullerton, California
> http://users.adelphia.net/~dewell/
>
Yes, the compressor can make any canonically equivalent change: not just
composing composition exclusions, but also reordering combining marks of
different combining classes. The only flaw I see is that these changes do
not have to be undone on decompression; at least, no other process is
allowed to rely on that having been done.
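
To illustrate the two kinds of change mentioned above, here is a minimal
Python sketch using only the standard unicodedata module. The helper name
canonically_equivalent and the sample strings are my own choices for
illustration; nothing here is taken from the SCSU or BOCU-1 specifications.
It only shows that a receiver comparing text by canonical equivalence
cannot tell the variants apart, which is why it must not depend on getting
the original code point sequence back.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: two strings are canonically equivalent
    when their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Reordering combining marks of different combining classes:
# U+0323 COMBINING DOT BELOW has class 220,
# U+0307 COMBINING DOT ABOVE has class 230.
below_then_above = "q\u0323\u0307"
above_then_below = "q\u0307\u0323"

# A composition exclusion: U+0958 DEVANAGARI LETTER QA decomposes to
# U+0915 U+093C, and NFC never recomposes it.
composed_qa = "\u0958"
decomposed_qa = "\u0915\u093C"

if __name__ == "__main__":
    # Different code point sequences, same canonical meaning.
    print(below_then_above != above_then_below)
    print(canonically_equivalent(below_then_above, above_then_below))
    print(canonically_equivalent(composed_qa, decomposed_qa))
    # NFC leaves the excluded composition decomposed.
    print(unicodedata.normalize("NFC", composed_qa) == decomposed_qa)
```

A process downstream of the decompressor that compares with
canonically_equivalent (or normalizes first) is unaffected by either
change; one that compares raw code points is not.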
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/