From: James Kass (thunder-bird@earthlink.net)
Date: Tue Sep 04 2007 - 10:41:00 CDT
Quoting from
http://unicode.org/Public/UNIDATA/StandardizedVariants.html
"All combinations not listed in StandardizedVariants.txt are
unspecified and are reserved for future standardization;
no conformant process may interpret them as standardized
variants."
Displaying invalid sequences using visible control pictures in plain-text
editing environments is conformant.
Not highlighting invalid sequences in some fashion means that they are
being treated/interpreted exactly as though they were valid
standardized variant sequences.
If, by providing control picture glyphs for VS characters, a font
developer enables an otherwise conformant application, like Notepad,
to remain conformant, where's the harm?
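
As a rough illustration of the point above, here is a minimal Python sketch of
an editor-side check that leaves listed sequences alone and makes every other
VS visible. The STANDARDIZED set is a tiny hand-copied subset used only for
illustration (a real tool would parse StandardizedVariants.txt), and the
function name and the "[VSn]" placeholder are hypothetical choices, not
anything Notepad or a font actually does.

```python
# Illustrative subset only; a real tool would load StandardizedVariants.txt.
STANDARDIZED = {
    (0x2229, 0xFE00),  # INTERSECTION with serifs
    (0x222A, 0xFE00),  # UNION with serifs
}


def is_variation_selector(cp):
    """True for VS1..VS16 (U+FE00..U+FE0F)."""
    return 0xFE00 <= cp <= 0xFE0F


def show_invalid_vs(text):
    """Replace each VS that does not complete a listed sequence
    with a visible placeholder such as [VS1]."""
    out = []
    prev = None  # code point of the preceding character, if any
    for ch in text:
        cp = ord(ch)
        if is_variation_selector(cp) and (prev, cp) not in STANDARDIZED:
            out.append("[VS%d]" % (cp - 0xFE00 + 1))
        else:
            out.append(ch)
        prev = cp
    return "".join(out)


# The isolated VS becomes visible; the listed sequence is left untouched.
print(show_invalid_vs("\u2229 + \uFE00 = \u2229\uFE00"))
```

With this subset, the isolated VS1 after the plus sign is shown as "[VS1]",
while the final INTERSECTION + VS1 pair passes through unchanged.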
Suppose, in plain-text, I wanted to point out something about VS characters
and standardized sequences, like:
∩ + ︀ = ∩︀
In order to get a sensible display, we'd need to be able to display the VS
character in isolation. That's really all that the control picture is doing.
Perhaps one of the most common uses on the web right now for VS
characters is to populate Unicode HTML charts. Again, this calls for
a control character to be displayed in isolation. How to display these
characters in isolation is up to the combination of the browser, the
rendering engine, and the selected font.
(Some web pages do not use an NCR to provide a display glyph for the
VS characters, but some pages do.)
If we find the above example on a web page and open the web page source
in Notepad, we might see something like:
&#x2229; + &#xFE00; = &#x2229;&#xFE00;
Why should the display make *less* sense if we convert the NCRs to UTF-8?
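
To make the NCR-to-UTF-8 step concrete, here is a small Python sketch. It
assumes the page source spells the example with hexadecimal NCRs for U+2229
and U+FE00 (decimal NCRs would behave the same way); the variable names are
only for illustration.

```python
import html

# Assumed page source: the example written with hexadecimal NCRs.
source = "&#x2229; + &#xFE00; = &#x2229;&#xFE00;"

# Resolve the NCRs to the characters they name.
decoded = html.unescape(source)

# The same text as UTF-8 bytes, as an editor would save it.
utf8 = decoded.encode("utf-8")

# List the code points so the otherwise invisible VS1 can be seen.
print(" ".join("U+%04X" % ord(c) for c in decoded if c != " "))
print(utf8)
```

The decoded text holds exactly the same characters the NCRs named, so the
isolated VS1 is still there; whether it is visible depends on the editor and
font.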
Best regards,
James Kass