From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Fri Sep 22 2006 - 15:52:13 CDT
Steve Summit wrote on Friday, September 22, 2006 8:02 PM
> Other than the (granted) incompatibility with string constants,
> the rest of Philippe's assertions about unsigned int's alleged
> unsuitability for manipulating Unicode characters in the BMP are
> incorrect, but I'll spare the list a point-by-point rebuttal.
Not quite. Unsigned int is only guaranteed a range of 0 to 0xffff, so it
cannot be relied on to normalise the string <U+FAD5>. That character is
itself in the BMP, but its normalised form is <U+25249> in all four
normalisation forms, and 0x25249 does not fit in 16 bits. Of course,
unsigned int is good enough to hold UTF-16 code *units*, which might just be
what Mike meant. (I.e., the type supports UTF-16, but not UTF-32.)
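To make the code unit point concrete, here is a minimal sketch of the standard UTF-16 surrogate calculation. The function name to_surrogates is hypothetical. It assumes only that unsigned long is at least 32 bits and that each resulting code unit fits in 16 bits, both of which C guarantees.

```c
#include <assert.h>

/* Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16
   surrogate pair. The code point needs unsigned long (at least 32 bits);
   each resulting code unit fits in the 16 bits guaranteed to unsigned int.
   to_surrogates is a hypothetical helper name, not from any library. */
static void to_surrogates(unsigned long cp, unsigned int *hi, unsigned int *lo)
{
    unsigned long v;

    assert(cp >= 0x10000UL && cp <= 0x10FFFFUL);
    v = cp - 0x10000UL;                              /* 20-bit offset */
    *hi = (unsigned int)(0xD800UL + (v >> 10));      /* top 10 bits */
    *lo = (unsigned int)(0xDC00UL + (v & 0x3FFUL));  /* bottom 10 bits */
}
```

For U+25249 this gives the pair <D854, DE49>: two values that a 16-bit unsigned int can hold, even though the code point itself cannot be held.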
Of course, you may be able to create Unicode string constants - it all
depends on what data structure is used. FFFF-terminated arrays would work,
e.g.:
/* LATIN_L etc. are assumed to be defined elsewhere as code point values. */
static const unsigned int remark[] = {
    LATIN_L, LATIN_o, LATIN_o, LATIN_k, EXCLAMATION_MARK, 0xffff };
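As a rough sketch of how such an FFFF-terminated array might be used, the following defines the constants and a length function. The enum values, the USTR_END macro and the name ustr_len are all assumptions made up for illustration. U+FFFF is a noncharacter, so it should not turn up in real text and can serve as the terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the code points used in the example. */
enum {
    LATIN_L = 0x004C,
    LATIN_o = 0x006F,
    LATIN_k = 0x006B,
    EXCLAMATION_MARK = 0x0021
};

/* Terminator: U+FFFF is a noncharacter (macro name is an assumption). */
#define USTR_END 0xffffu

static const unsigned int remark[] = {
    LATIN_L, LATIN_o, LATIN_o, LATIN_k, EXCLAMATION_MARK, USTR_END
};

/* Count the code units before the 0xffff terminator. */
static size_t ustr_len(const unsigned int *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != USTR_END)
        n++;
    return n;
}
```

With these definitions ustr_len(remark) is 5. This only works for BMP text, or for UTF-16 code units, since the element type is not guaranteed to hold anything above 0xffff.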
Richard.