From: jcowan@reutershealth.com
Date: Tue Dec 09 2003 - 13:16:05 EST
Peter Kirk scripsit:
> No, surely not. If the wcslen() function is fully Unicode conformant, it
> should give the same output whatever the canonically equivalent form of
> its input.
Not so. Remember, the conformance requirement is not that a process can't
distinguish between canonically equivalent strings (otherwise a normalizer
would be impossible; it wouldn't know whether to normalize or not!) but that
a process can't assume that *other* processes will distinguish between
canonically equivalent strings. Equally, it can't assume that the other
process will fail to distinguish them, either.
In an environment in which C wide characters are Unicode characters,
wcslen returns the number of individual characters in the literal string.
How many characters it contains depends on how many were placed in the
source file by the author and what, if anything, has happened to the source
file since.
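A minimal sketch of the point, assuming (as above) an implementation whose wchar_t values are Unicode characters. The array and function names are made up for illustration, and the two spellings of "é" are written with universal character names so that nothing done to the source file can change them:

```c
#include <stdio.h>
#include <wchar.h>

/* "é" as one precomposed character: U+00E9. */
const wchar_t precomposed[] = L"\u00E9";

/* The canonically equivalent sequence: U+0065 followed by U+0301. */
const wchar_t decomposed[] = L"e\u0301";

/* Print what wcslen reports for each spelling. */
void show_lengths(void)
{
    printf("precomposed: %zu\n", wcslen(precomposed));
    printf("decomposed:  %zu\n", wcslen(decomposed));
}
```

Under that assumption, calling show_lengths() from main should report 1 for the first string and 2 for the second, even though the two are canonically equivalent. Had the literals been typed directly rather than escaped, the counts would depend on which form the editor, or any later normalizing tool, left in the file.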
-- 
As you read this, I don't want you to feel      John Cowan
sorry for me, because, I believe everyone       jcowan@reutershealth.com
will die someday.                               http://www.reutershealth.com
        --From a Nigerian-type scam spam I got  http://www.ccil.org/~cowan