Re: Text in composed normalized form is king, right? Does anyone generate text in decomposed normalized form?

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Wed, 6 Feb 2013 10:18:33 +0100

2013/2/5 Richard Wordingham <richard.wordingham_at_ntlworld.com>:
> On Tue, 5 Feb 2013 12:16:47 +0100
> Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:
>
>> A process can be FULLY conforming by preserving the canonical
>> equivalence and treating ALL strings that are canonically equivalent,
> without having to normalize them in any recommended form,...
>
> Try doing UCA collation with <U+0302 COMBINING CIRCUMFLEX ACCENT,
> U+0067 LATIN SMALL LETTER G> being a collation element (with arbitrary
> collation elements) without doing normalisation.

<0302, 0067> is defective, and its normalisation is still <0302,
0067>. It is NOT canonically equivalent to <0067, 0302>.
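
As a quick check of the claim above, here is a minimal sketch using
Python's standard unicodedata module. The names defective, ordinary and
canonically_equivalent are only illustrative; comparing NFD forms is one
common way to test canonical equivalence, not the only one.

```python
import unicodedata

# <0302, 0067>: the combining circumflex comes first, with no base before it.
defective = "\u0302\u0067"
# <0067, 0302>: g followed by the combining circumflex.
ordinary = "\u0067\u0302"


def canonically_equivalent(a: str, b: str) -> bool:
    """Treat two strings as canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Normalisation leaves the defective sequence untouched in both forms.
nfc_defective = unicodedata.normalize("NFC", defective)
nfd_defective = unicodedata.normalize("NFD", defective)

print(nfc_defective == defective)                    # normalisation is a no-op here
print(nfd_defective == defective)
print(canonically_equivalent(defective, ordinary))   # the two orders are distinct
```

By contrast, NFC of <0067, 0302> composes to the single code point U+011D.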

I was not speaking about arbitrary collation elements containing
defective sequences. Is this a real case?

> Consider how you
> would handle <U+011D LATIN SMALL LETTER G WITH CIRCUMFLEX, U+011D,
> U+011D>!

With which collation rule set? One that includes defective collation elements?
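
For reference, a small sketch of what the quoted string looks like once
decomposed, again with Python's unicodedata module. The variable names are
illustrative, and the "contraction" here is simply the hypothetical
collation element <0302, 0067> from the quoted message, not an entry from
any actual collation rule set.

```python
import unicodedata

# <U+011D, U+011D, U+011D>: three precomposed g-with-circumflex letters.
text = "\u011d" * 3

# Canonical decomposition turns each U+011D into <0067, 0302>.
nfd = unicodedata.normalize("NFD", text)
code_points = [f"{ord(c):04X}" for c in nfd]

# The hypothetical collation element <0302, 0067> from the quoted message.
contraction = "\u0302\u0067"

# It never appears in the precomposed string, only across letter
# boundaries in the decomposed one.
in_composed = text.count(contraction)
in_decomposed = nfd.count(contraction)

print(code_points)
print(in_composed, in_decomposed)
```

So whether such an element can match at all depends on whether the
process looks at the string as stored or at its decomposed form.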
Received on Wed Feb 06 2013 - 03:25:32 CST
