CVTUTF.C bug & question

From: Oliver Steinau (oliver.steinau@STencode.de)
Date: Wed Aug 23 2000 - 03:46:47 EDT


I have a question concerning the CVTUTF.C file that is on the CD in the
Unicode 3.0 book. There's a piece of code which I don't think is correct...

Function ConvertUTF8toUTF16 contains the following piece of code at line 210
(ch is the result of the conversion so far):

                if (ch <= kMaximumUCS2) {
                        *target++ = ch;
                } else if (ch > kMaximumUCS4) {
                        *target++ = kReplacementCharacter;
                } else {
                        if (target + 1 >= targetEnd) {
                                result = targetExhausted; break;
                        };
                        ch -= halfBase;
                        *target++ = (ch >> halfShift) + kSurrogateHighStart;
                        *target++ = (ch & halfMask) + kSurrogateLowStart;
                };

with kMaximumUCS2 = 0xffff and kMaximumUCS4 = 0x7fffffff.

Shouldn't the second comparison read "else if (ch > kMaximumUTF16)..."?
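
To make the question concrete, here is a minimal standalone sketch of the branch, written as a helper so that each comparison can be checked in isolation. It is not the code from the CD: the helper name encode_utf16 is made up for illustration, and the values of kMaximumUTF16 (0x10FFFF), kReplacementCharacter (0xFFFD), the surrogate starts, halfShift, halfBase and halfMask are assumed here, since the post only gives kMaximumUCS2 and kMaximumUCS4. Under those assumptions, values up to kMaximumUCS2 fit in one 16-bit unit, values above the assumed kMaximumUTF16 cannot be expressed as a surrogate pair and get the replacement character, and everything in between becomes two units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define kMaximumUCS2          0x0000FFFFUL  /* value given in the post */
#define kMaximumUTF16         0x0010FFFFUL  /* assumed, not given in the post */
#define kReplacementCharacter 0x0000FFFDUL  /* assumed */
#define kSurrogateHighStart   0xD800UL      /* assumed */
#define kSurrogateLowStart    0xDC00UL      /* assumed */
#define halfShift             10            /* assumed */
#define halfBase              0x00010000UL  /* assumed */
#define halfMask              0x3FFUL       /* assumed */

/* Hypothetical helper: writes ch as UTF-16 at target and returns the
   number of 16-bit units written, or 0 if the room before targetEnd
   is not sufficient. */
static size_t encode_utf16(uint32_t ch, uint16_t *target, uint16_t *targetEnd)
{
    assert(target <= targetEnd);

    if (ch <= kMaximumUCS2) {
        /* Fits in a single 16-bit unit. */
        if (target >= targetEnd) return 0;
        target[0] = (uint16_t)ch;
        return 1;
    }
    if (ch > kMaximumUTF16) {
        /* Beyond what a surrogate pair can express. */
        if (target >= targetEnd) return 0;
        target[0] = (uint16_t)kReplacementCharacter;
        return 1;
    }
    /* Needs a surrogate pair, so two free units are required. */
    if (targetEnd - target < 2) return 0;
    ch -= halfBase;
    target[0] = (uint16_t)((ch >> halfShift) + kSurrogateHighStart);
    target[1] = (uint16_t)((ch & halfMask) + kSurrogateLowStart);
    return 2;
}
```

With these assumed constants, a value such as 0x110000 takes the replacement branch instead of reaching the surrogate arithmetic, where it would produce a high unit outside the high-surrogate range.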

In addition, function ConvertUTF8toUCS4 is a **COPY** of ConvertUTF8toUTF16,
which surely isn't what was intended. To fix this, would it be
correct to simply replace the above code with

        if (ch > kMaximumUCS4) {
                ch = kReplacementCharacter;
        }
        *target++ = ch;

? It would be great if someone could comment on this...
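
For reference, the proposed replacement can be tried out in isolation with a sketch like the one below. The helper name store_ucs4 and the bounds check against targetEnd are additions for illustration only, and the value 0xFFFD for kReplacementCharacter is assumed; only kMaximumUCS4 = 0x7fffffff comes from the values quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define kMaximumUCS4          0x7FFFFFFFUL  /* value given in the post */
#define kReplacementCharacter 0x0000FFFDUL  /* assumed, not given in the post */

/* Hypothetical helper: stores ch as one UCS-4 unit at *target and
   advances *target. Returns 1 on success, 0 if the target is full. */
static int store_ucs4(uint32_t ch, uint32_t **target, uint32_t *targetEnd)
{
    assert(*target <= targetEnd);

    if (*target >= targetEnd) {
        return 0;                    /* no room left in the target buffer */
    }
    if (ch > kMaximumUCS4) {
        ch = kReplacementCharacter;  /* out of range: substitute */
    }
    *(*target)++ = ch;               /* one 32-bit unit, no surrogates */
    return 1;
}
```

Unlike the UTF-16 case, every value up to kMaximumUCS4 is stored as a single unit here, so no surrogate arithmetic and no two-unit space check are needed.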

Thanks a lot,

/oliver




