Re: Text Editors and Canonical Equivalence (was Coloured diacritics)

From: jcowan@reutershealth.com
Date: Tue Dec 09 2003 - 13:16:05 EST

Next message: Anupam Agarwal: "Unsubscribe"

Previous message: Mark Davis: "Overload (was Re: Text Editors and Canonical Equivalence (was Coloured diacritics))"
In reply to: Peter Kirk: "Re: Text Editors and Canonical Equivalence (was Coloured diacritics)"
Next in thread: Peter Kirk: "Re: Text Editors and Canonical Equivalence (was Coloured diacritics)"
Reply: Peter Kirk: "Re: Text Editors and Canonical Equivalence (was Coloured diacritics)"

Peter Kirk scripsit:

> No, surely not. If the wcslen() function is fully Unicode conformant, it
> should give the same output whatever the canonically equivalent form of
> its input.

Not so. Remember, the conformance requirement is not that a process can't
distinguish between canonically equivalent strings (otherwise a normalizer
would be impossible; it wouldn't know whether to normalize or not!) but that
a process can't assume that *other* processes will distinguish between
canonically equivalent strings. Equally, it can't assume that the other
process will fail to distinguish them, either.

In an environment in which C wide characters are Unicode characters,
wcslen returns the number of wide characters in the literal string, counting
each base character and each combining mark separately. How many characters
the string contains depends on how many the author placed in the source file
and on what, if anything, has happened to the source file since.

-- 
As you read this, I don't want you to feel      John Cowan 
sorry for me, because, I believe everyone       jcowan@reutershealth.com
will die someday.    -- From a Nigerian-type    http://www.reutershealth.com
                        scam spam I got         http://www.ccil.org/~cowan

This archive was generated by hypermail 2.1.5 : Tue Dec 09 2003 - 13:52:39 EST