Re: Unicode Search Engines

From: Michael Everson (everson@evertype.com)
Date: Wed Feb 20 2002 - 20:09:03 EST


At 13:21 -0500 2002-02-20, John Cowan wrote:
>Marco Cimarosti scripsit:
>
>> But, if there is no precomposed character for "q with tilde", then the
>> combining tilde *must* be maintained in all normalization forms.
>
>Correct.
>
>> Why? Isn't that what W3C asked?
>
>No. The W3C CharMod wants receivers to check normalization and
>reject unnormalized documents, *not* to normalize input.

What does such rejection imply? That an HTML document encoded in
UTF-8 and containing U+0041 U+0301 is acceptable, but an HTML
document encoded in UTF-8 and containing U+00C1 is not?
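
To make the two points above concrete, here is a minimal sketch in
Python using the standard unicodedata module. It assumes the form a
receiver checks is NFC, which is my reading of the CharMod drafts; the
helper names is_nfc and accept are hypothetical, invented for this
illustration, and are not taken from any W3C text. Under that
assumption a check-and-reject receiver would accept the precomposed
U+00C1 and reject the sequence U+0041 U+0301, while "q with tilde"
passes in its combining form because no precomposed character exists.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def is_nfc(text: str) -> bool:
    """Return True if text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def accept(text: str) -> str:
    """Hypothetical receiver: check normalization, never repair it.

    Assumption: the required form is NFC. Unnormalized input is
    rejected with an error instead of being silently normalized.
    """
    if not is_nfc(text):
        raise ValueError("input is not NFC-normalized; rejecting")
    return text


decomposed_a = "A\u0301"   # U+0041 U+0301, A + combining acute
precomposed_a = "\u00C1"   # U+00C1, precomposed A with acute
q_tilde = "q\u0303"        # q + combining tilde, no precomposed form

# The combining tilde on q survives every normalization form unchanged.
q_forms = {form: unicodedata.normalize(form, q_tilde) for form in FORMS}

for label, sample in (("U+0041 U+0301", decomposed_a),
                      ("U+00C1", precomposed_a),
                      ("U+0071 U+0303", q_tilde)):
    verdict = "accepted" if is_nfc(sample) else "rejected"
    print(f"{label}: {verdict}")
```

Note that accept raises on the decomposed sequence instead of
converting it, which is the distinction John draws between checking
normalization and normalizing input.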

-- 
Michael Everson *** Everson Typography *** http://www.evertype.com


