It seems I missed the isLegalUTF8 function calls that verify
whether the input is valid UTF8. Never mind then, it's all OK.
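
For anyone reading along, here is a rough sketch of why the larger table entries can be harmless when a separate validity check runs first. It is not the code from ConvertUTF.c. The names sketch_trailing_bytes, sketch_is_legal_utf8 and sketch_self_check are made up for illustration, and the rules encode my assumption of what the tightened four-byte definition requires. The first helper returns the same values as the quoted table, and the second rejects any sequence longer than 4 bytes, so the entries 4 and 5 never lead to an accepted sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: trailing-byte count implied by a lead byte,
   giving the same values as the quoted 256-entry table. */
static int sketch_trailing_bytes(unsigned char lead)
{
    if (lead < 0xC0) return 0;  /* ASCII and continuation bytes */
    if (lead < 0xE0) return 1;
    if (lead < 0xF0) return 2;
    if (lead < 0xF8) return 3;
    if (lead < 0xFC) return 4;  /* old 5-byte form */
    return 5;                   /* old 6-byte form */
}

/* Hypothetical check (an assumption, not the library's function):
   is s[0..length-1] exactly one well-formed sequence of at most 4 bytes? */
static bool sketch_is_legal_utf8(const unsigned char *s, size_t length)
{
    if (length < 1 || length > 4) return false;  /* rejects 5- and 6-byte forms */

    /* Every byte after the lead must look like 10xxxxxx. */
    for (size_t i = 1; i < length; i++)
        if ((s[i] & 0xC0) != 0x80) return false;

    unsigned char lead = s[0];
    switch (length) {
    case 1:
        return lead < 0x80;
    case 2:
        return lead >= 0xC2 && lead <= 0xDF;            /* 0xC0, 0xC1 are overlong */
    case 3:
        if (lead < 0xE0 || lead > 0xEF) return false;
        if (lead == 0xE0 && s[1] < 0xA0) return false;  /* overlong */
        if (lead == 0xED && s[1] > 0x9F) return false;  /* surrogate range */
        return true;
    default: /* length == 4 */
        if (lead < 0xF0 || lead > 0xF4) return false;
        if (lead == 0xF0 && s[1] < 0x90) return false;  /* overlong */
        if (lead == 0xF4 && s[1] > 0x8F) return false;  /* above U+10FFFF */
        return true;
    }
}

/* Small self-check: the table-style count is still 4 for a 0xF8 lead byte,
   but the validity check refuses the resulting 5-byte sequence. */
static void sketch_self_check(void)
{
    const unsigned char five[] = { 0xF8, 0x88, 0x80, 0x80, 0x80 };
    const unsigned char euro[] = { 0xE2, 0x82, 0xAC };

    assert(sketch_trailing_bytes(five[0]) == 4);
    assert(!sketch_is_legal_utf8(five, sizeof five));
    assert(sketch_is_legal_utf8(euro, sizeof euro));
}
```

A converter built this way would look up the trailing-byte count, call the check on that many bytes plus the lead byte, and treat the input as an error whenever the check fails.
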
On Wednesday, July 17, 2002, at 01:57 , Theodore H. Smith wrote:
> The file ConvertUTF.c contains this array:
>
>
> static const char trailingBytesForUTF8[256] = {
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
> 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5
> };
>
> Isn't a UTF8 sequence at most 4 bytes long? If so, the entries
> above 3 should not be there.
>
> There could be similar mistakes with 6 byte UTF8 codes. I think
> this file may have been written before UTF8 was tightened up.
> Perhaps this code should be tightened up along with the
> standard now?
>
-- Theodore H. Smith - Macintosh Consultant / Contractor. My website: <www.elfdata.com/>