From: Peter Kirk (peterkirk@qaya.org)
Date: Tue Aug 10 2004 - 13:43:06 CDT
On 10/08/2004 18:33, Jon Hanna wrote:
> ...
>
>As for modern markup, consider if instead of &#x0304; you had &#x0338;
>By the rules of XML that is treated as if the character U+0338 was there rather
>than the escape sequence.
>By the rules of Unicode the sequence U+003E, U+0338 is treated the same as the
>character U+226F.
>By the rules of XML replacing >&#x0338; with U+226F would mean the document was
>no longer well-formed.
>
>So even without an explicit spec saying otherwise the above would be
>problematic.
>
>
>
This means that the rules of XML conflict with the rules of Unicode. If
the string is a Unicode string, U+226F is canonically equivalent to
<U+003E, U+0338> and therefore any higher level protocol should treat
the two sequences as identical, rather than reject one of them as
causing the document to be ill-formed.
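
To make the conflict concrete, here is a minimal sketch in Python using only the standard library (unicodedata and xml.etree.ElementTree). The element name "a", the helper is_well_formed and the variable names are my own illustrative assumptions, not anything taken from the XML or Unicode specifications. It checks that the two sequences are canonically equivalent, and then shows what can happen when NFC normalization is applied to raw markup in which the character reference has already been expanded.

```python
import unicodedata
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Hypothetical helper: True if the standard-library parser accepts doc."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


DECOMPOSED = "\u003E\u0338"   # GREATER-THAN SIGN + COMBINING LONG SOLIDUS OVERLAY
PRECOMPOSED = "\u226F"        # NOT GREATER-THAN

# Canonical equivalence: both strings have the same canonical decomposition.
canonically_equivalent = (
    unicodedata.normalize("NFD", DECOMPOSED)
    == unicodedata.normalize("NFD", PRECOMPOSED)
)

# The document as written, with the combining mark as a character reference.
escaped = "<a>&#x0338;</a>"

# The same document with the reference replaced by the literal character.
expanded = "<a>\u0338</a>"

# Normalizing the raw markup to NFC fuses the tag's closing ">" with U+0338.
normalized = unicodedata.normalize("NFC", expanded)

print(canonically_equivalent)       # the two sequences are equivalent
print(is_well_formed(escaped))      # accepted by the parser
print(is_well_formed(expanded))     # accepted by the parser
print(is_well_formed(normalized))   # rejected: the start tag is never closed
```

The point of the sketch is only that a process which treats the two sequences as interchangeable at the text level can turn a document the parser accepts into one it rejects, because the ">" that was markup has become part of a different character.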
-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/