From: Arcane Jill (arcanejill@ramonsky.com)
Date: Thu Dec 11 2003 - 11:41:47 EST
I think Marco here has the definitive answer. I've thought about this a
lot, and it seems to me that he's right.
A /consequence/ of this appears to be that it DOESN'T MATTER whether or
not a text editor normalises C or C++ source code, into either NFC or
NFD. It shouldn't make the slightest bit of difference /unless/ the
program has been very sloppily (i.e. badly) written. Whether or not the
source code is normalised, and whichever form is used, it will still
compile to an executable whose behavior is what the programmer wants.
(Of course, one can _contrive_ examples where it makes a difference, but
in general, the primary reason for wanting to know the number of
wchar_ts in a string is so that you can reserve the right amount of
storage space, not so that you can control the flow of execution on
that basis.)
Thus, I'm now becoming convinced that normalising Unicode plain text
would be a reasonable feature for a text editor to offer. (Even in XML
documents, it would only affect /one character/, if I've understood this
thread correctly).
Jill
> -----Original Message-----
> From: Marco Cimarosti [mailto:marco.cimarosti@essetre.it]
> Sent: Tuesday, December 09, 2003 6:14 PM
> To: 'Arcane Jill'; unicode@unicode.org
> Subject: RE: Text Editors and Canonical Equivalence (was Coloured
> diacritics)
>
>
> The answer is:
>
> int n = wcslen(L"café");
>
> That's why you take the burden to call the "wcslen" library function
> rather than assuming a hard-coded value such as:
>
> int n = 4; // the length of string "café"