From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Oct 12 2004 - 12:25:16 CST
From: "Doug Ewell" <dewell@adelphia.net>
> Theodore H. Smith <delete at elfdata dot com> wrote:
>
>>> - the file mixes UTF-8 and UTF-16
>>
>> Does this file mix UTF-8 and UTF-16? I thought it just had surrogates
>> encoded into UTF-8? Of course a surrogate should never exist in UTF-8.
>
> You are right. Philippe's statement was incorrect, and also puzzling.
Have you read the file content? It clearly and explicitly speaks about
UTF-16, which has no place in a text file meant for UTF-8, unless the file
was used as a test for CESU-8 (which is not UTF-16 either, and not even
UTF-8). My statement was correct: it is based on the fact that the test file
was created for the older (RFC) version of UTF-8 used in old versions of ISO
10646, and never referenced by Unicode in any version (at least not
explicitly, until the v4.01 clarification).
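
To make the distinction concrete, here is a minimal Python sketch of the difference between a supplementary character encoded in UTF-8 and the same character encoded CESU-8 style, as a surrogate pair with each surrogate written as a 3-byte sequence. The helper name find_encoded_surrogates is hypothetical and is not taken from the test file under discussion. It assumes Python 3, whose strict "utf-8" codec rejects encoded surrogates and whose "surrogatepass" error handler allows them.

```python
def find_encoded_surrogates(data: bytes) -> list:
    """Return byte offsets of 3-byte sequences that encode U+D800..U+DFFF.

    In well-formed UTF-8 the lead byte 0xED must be followed by 0x80..0x9F;
    a second byte in 0xA0..0xBF means a surrogate code point was encoded.
    """
    offsets = []
    for i in range(len(data) - 2):
        if (data[i] == 0xED
                and 0xA0 <= data[i + 1] <= 0xBF
                and 0x80 <= data[i + 2] <= 0xBF):
            offsets.append(i)
    return offsets


def is_strict_utf8(data: bytes) -> bool:
    """True if the bytes decode under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+10000 as one 4-byte UTF-8 sequence.
utf8_form = "\U00010000".encode("utf-8")

# The same character as the surrogate pair D800 DC00, each surrogate
# encoded separately in 3 bytes (the CESU-8 style byte sequence).
cesu8_style = "\ud800\udc00".encode("utf-8", "surrogatepass")

print(utf8_form.hex(), find_encoded_surrogates(utf8_form))
print(cesu8_style.hex(), find_encoded_surrogates(cesu8_style))
print(is_strict_utf8(utf8_form), is_strict_utf8(cesu8_style))
```

A strict UTF-8 decoder accepts the first byte string and rejects the second, which is why a file containing such sequences is a test for something other than UTF-8.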