Re: HTML Validation (was Re: Clean and Unicode compliance)

From: Elliotte Rusty Harold (elharo@metalab.unc.edu)
Date: Sun Dec 16 2001 - 07:17:54 EST

Previous message: James Kass: "HTML Validation (was Re: Clean and Unicode compliance)"
In reply to: James Kass: "HTML Validation (was Re: Clean and Unicode compliance)"
Next in thread: James Kass: "Re: HTML Validation (was Re: Clean and Unicode compliance)"
Next in thread: Curtis Clark: "Plane One use, was Re: HTML Validation"
Reply: James Kass: "Re: HTML Validation (was Re: Clean and Unicode compliance)"

At 3:07 AM -0800 12/16/01, James Kass wrote:

>Tests run on non-BMP text show no problem for Plane One using
>UTF-8 encoding but error messages are generated when these
>characters are referenced as NCRs.
>

I suspect there are a lot of random mistakes like this waiting to be
discovered. I recently added a Plane-1 musical symbol to a book I'm
working on, and watched Xerces's XMLSerializer class trip over it. It
emitted the character as two character references, one for each half
of the surrogate pair, rather than as a single reference to the code
point, thus producing malformed HTML. It worked when I switched to
UTF-8 encoding, though.
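
To illustrate the failure mode described above, here is a minimal Java
sketch. It is not Xerces code: the class and method names are
hypothetical, and U+1D11E (MUSICAL SYMBOL G CLEF) merely stands in for
the unnamed musical symbol. It contrasts escaping one UTF-16 code unit
at a time, which yields two references to surrogate code points, with
escaping one code point at a time, which yields a single reference.

```java
// Hypothetical illustration; none of these names come from Xerces.
class NcrEscaper {

    // Buggy approach: one character reference per UTF-16 code unit.
    // A supplementary character becomes two references to surrogates.
    static String escapePerCodeUnit(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char unit = text.charAt(i);
            if (unit < 0x80) {
                out.append(unit);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(unit).toUpperCase())
                   .append(';');
            }
        }
        return out.toString();
    }

    // Intended behavior: one character reference per Unicode code point.
    // codePointAt combines a valid surrogate pair into a single value.
    static String escapePerCodePoint(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            i += Character.charCount(codePoint); // advance 1 or 2 code units
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String clef = "\uD834\uDD1E"; // U+1D11E as a surrogate pair
        System.out.println(escapePerCodeUnit(clef));  // &#xD834;&#xDD1E;
        System.out.println(escapePerCodePoint(clef)); // &#x1D11E;
    }
}
```

The sketch only handles non-ASCII escaping; a real serializer would
also have to escape markup characters such as & and <. Writing the
document in UTF-8 sidesteps the issue because the character is then
emitted directly as bytes, with no character reference involved.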

I suspect a lot of our tools haven't been thoroughly tested with
Plane-1 characters and are likely to have these sorts of bugs in them.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001)                    |
| http://www.ibiblio.org/xml/books/bible2/                           |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java news: http://www.cafeaulait.org/        |
| Read Cafe con Leche for XML news: http://www.ibiblio.org/xml/      |
+----------------------------------+---------------------------------+

This archive was generated by hypermail 2.1.2 : Sun Dec 16 2001 - 09:55:12 EST