From: Mark Davis (mark.davis@jtcsv.com)
Date: Fri Dec 05 2003 - 13:03:18 EST
> OK. So it's Mark, not me, who is unilaterally extending C10.
Where on earth do you get that? I did say that, in practice, NFC should be
produced, but that is simply a practical guideline, independent of C10.
Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄
----- Original Message -----
From: "Peter Kirk" <peterkirk@qaya.org>
To: "Doug Ewell" <dewell@adelphia.net>
Cc: "Unicode Mailing List" <unicode@unicode.org>
Sent: Fri, 2003 Dec 05 02:51
Subject: Re: Compression through normalization
> On 05/12/2003 00:34, Doug Ewell wrote:
>
> >Peter Kirk <peterkirk at qaya dot org> wrote:
> >
> >>Surely ignoring Composition Exclusions is not unilaterally extending
> >>C10. The excluded precomposed characters are still canonically
> >>equivalent to the decomposed (and normalised) forms. And so composing
> >>a text with them, for compression or any other purpose, still conforms
> >>to C10, which explicitly allows "replacement of character sequences by
> >>their canonical-equivalent sequences" - not only when the resulting
> >>sequence is NFC or NFD.
> >
> >Ignoring the composition exclusions does still respect canonical
> >equivalence, but does not preserve a canonical normalization form (using
> >the language of UAX #15). So although it is not a violation of C10, it
> >does seem to run afoul of Mark's recommendation:
> >
> >"In practice, if a compressor does not produce codepoint-identical text,
> >it should produce NFC (not just any canonically equivalent text), and
> >should document that it does so."
> >
> OK. So it's Mark, not me, who is unilaterally extending C10. Well, Ken
> said much the same, so it's bilateral; and I agree it is a sensible
> extension.
>
> But, as Ken also pointed out, it is quite permissible to use any
> encoding for the intermediate (e.g. compressed) form of the text, as long
> as it is possible to recover from this the normalised form of the
> original text. My suggestion of composing the text with the precomposed
> characters listed in Composition Exclusions meets this test, in a way
> not met by some of the other suggestions, e.g. composing Korean
> characters into precomposed forms which are (sadly) not canonically
> equivalent.
>
> --
> Peter Kirk
> peter@qaya.org (personal)
> peterkirk@qaya.org (work)
> http://www.qaya.org/
>
>
>
>
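For anyone who wants to check the distinction being argued over, here is a
minimal sketch in Python using the standard unicodedata module. The example
characters and the helper names (canonically_equivalent, is_nfc) are
illustrative choices, not anything taken from the thread. U+0958 DEVANAGARI
LETTER QA is assumed as a typical composition-excluded character, and the
Korean case is assumed to mean replacing two simple jamo (U+1100 U+1100) with
the single complex jamo U+1101, which is one plausible reading of Peter's
remark.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def is_nfc(s: str) -> bool:
    """A string is in NFC if normalizing it to NFC leaves it unchanged."""
    return unicodedata.normalize("NFC", s) == s


# Composition-excluded character: canonically decomposes to U+0915 U+093C,
# but NFC never recomposes the pair into U+0958.
excluded = "\u0958"
decomposed = "\u0915\u093C"

# Illustrative Korean case (an assumption about what was meant):
# two KIYEOK jamo versus the single SSANGKIYEOK jamo.
two_jamo = "\u1100\u1100"
one_jamo = "\u1101"

if __name__ == "__main__":
    # Permitted by C10: the replacement keeps canonical equivalence...
    print(canonically_equivalent(excluded, decomposed))
    # ...but the result is not NFC, so it falls outside the guideline.
    print(is_nfc(excluded), is_nfc(decomposed))
    # The NFC of the original text is still recoverable from the excluded form.
    print(unicodedata.normalize("NFC", excluded) == decomposed)
    # The jamo replacement is not canonically equivalent at all.
    print(canonically_equivalent(two_jamo, one_jamo))
```

The first three checks correspond to the claim that an intermediate form built
from excluded characters still lets you recover the normalised text; the last
one shows why the jamo substitution does not.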
This archive was generated by hypermail 2.1.5 : Fri Dec 05 2003 - 14:05:41 EST