From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Nov 26 2003 - 07:21:37 EST
On 26/11/2003 02:29, Philippe Verdy wrote:
>jameskass@att.net wrote:
>
>
>>Briefly, it's my opinion that applications which claim to support
>>and comply with Unicode should not 'step on' Unicode text. Any
>>loopholes in the 'letter of the law' which allow applications to
>>mung or reject Unicode text should be plugged.
>
>If this "plugging" is to be done, it should also apply to HTML and XML.
>For now, combining characters can be encoded directly after the quote
>character (single or double) that marks the beginning of an attribute
>value, or directly after a tag-closing ">". HTML and XML parsers will
>parse these quotes or greater-than signs while ignoring the combining
>sequence that follows, which creates defective combining sequences, and
>this is a problem.
>
>...
>
>
Why is this a problem? Quotes and ">" with combining marks are
presumably not legal HTML or XML, so interpreting a quote or ">"
followed by combining marks as a plain quote or ">" plus a defective
combining sequence is unambiguous, surely? There could of course be
problems if there were any precomposed combinations of quotes or ">"
with combining characters, but I don't think there are any, are there?
Your proposed solution to the problem requires numeric entities, which
makes it messy, and it is unnecessary.
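
The question of precomposed combinations can be checked mechanically
rather than from memory. The following is only a rough sketch in Python,
assuming the standard unicodedata module; the helper name
precomposed_with is hypothetical and the results depend on the Unicode
version bundled with the interpreter. It scans every code point for a
canonical decomposition made of a given base character plus one
combining mark.

```python
import sys
import unicodedata

def precomposed_with(base):
    """Hypothetical helper: code points that canonically decompose to
    `base` followed by exactly one combining mark."""
    found = []
    for cp in range(sys.maxunicode + 1):
        decomp = unicodedata.decomposition(chr(cp))
        if not decomp or decomp.startswith("<"):
            continue  # no decomposition, or a compatibility-only mapping
        parts = [int(p, 16) for p in decomp.split()]
        if (len(parts) == 2
                and parts[0] == ord(base)
                and unicodedata.combining(chr(parts[1])) != 0):
            found.append(cp)
    return found

# Check the three delimiters discussed in this thread.
for base in ('"', "'", ">"):
    hits = precomposed_with(base)
    names = [f"U+{cp:04X} {unicodedata.name(chr(cp))}" for cp in hits]
    print(repr(base), names)
```

For ">", this scan should report U+226F NOT GREATER-THAN, whose
canonical decomposition is ">" followed by U+0338 COMBINING LONG SOLIDUS
OVERLAY, so a normalizing process could in principle turn a tag-closing
">" plus that mark into a single different character. Whether that
matters in practice depends on whether anything normalizes the markup
before it is parsed, which is an assumption outside this sketch.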
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/