Mark Davis 🍱️ <mark at macchiato dot com> wrote:
>> TUS 8.0 Chapter 3 C6: "A process shall not assume that the
>> interpretations of two canonical-equivalent character sequences are
>> distinct."
>
> A compiler will take source code containing String x="á"; and compile
> it to a certain binary. If that same source code is NFD'd, the
> compiler will produce a different result.
>
> Do you really think that such a compiler is not compliant with Unicode??
> If so, then we should add some more clarifications around C6.

I agree. The word "interpretations" in C6 can't have been intended to
include the interpretation of code points qua code points. That would
make a great many internal processes impossible.
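
To make the "code points qua code points" distinction concrete, here is a
minimal sketch in Python (not the Java-like syntax of the quoted example;
the variable names are only illustrative). It shows that the two forms of
"á" are different sequences at the code point and byte level, which is
all a compiler emitting a string literal looks at, even though they are
canonically equivalent:

```python
import unicodedata

# Written with escapes so the example survives any normalization
# applied to this message in transit.
nfc = "\u00e1"                            # LATIN SMALL LETTER A WITH ACUTE
nfd = unicodedata.normalize("NFD", nfc)   # "a" followed by U+0301

# As raw code point sequences, the two strings differ ...
code_points_nfc = [f"U+{ord(c):04X}" for c in nfc]
code_points_nfd = [f"U+{ord(c):04X}" for c in nfd]

# ... and so do the bytes a compiler would store for the literal.
bytes_nfc = nfc.encode("utf-8")
bytes_nfd = nfd.encode("utf-8")

# Yet they are canonically equivalent: normalizing both to the same
# form yields identical strings.
equivalent = unicodedata.normalize("NFC", nfd) == nfc

print(code_points_nfc, bytes_nfc)
print(code_points_nfd, bytes_nfd)
print(equivalent)
```

Nothing here requires the compiler to normalize anything; it only
illustrates why byte-for-byte output differs between the two inputs.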

I think of C6 as meaning that spell-checkers, for example, should not
treat José (NFC, four code points, ending in U+00E9) and José (NFD,
five code points, ending in U+0065 U+0301) as separate entries.
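
One way a spell-checker could honor that reading is to normalize both the
word list and each lookup to a single form. The following is only a
sketch under that assumption; `make_dictionary` and `is_known` are
hypothetical helper names, not part of any real spell-checker:

```python
import unicodedata

def make_dictionary(words):
    """Store every entry in NFC so equivalent spellings collapse."""
    return {unicodedata.normalize("NFC", w) for w in words}

def is_known(word, dictionary):
    """Normalize the query the same way before looking it up."""
    return unicodedata.normalize("NFC", word) in dictionary

jose_nfc = "Jos\u00e9"      # four code points
jose_nfd = "Jose\u0301"     # five code points

# Adding both spellings yields a single entry, not two.
dictionary = make_dictionary([jose_nfc, jose_nfd])

print(len(jose_nfc), len(jose_nfd), len(dictionary))
print(is_known(jose_nfc, dictionary), is_known(jose_nfd, dictionary))
```

Choosing NFC rather than NFD here is arbitrary; what matters is that the
stored entries and the queries go through the same normalization.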

-- Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

Received on Tue Sep 08 2015 - 10:20:12 CDT