From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Wed Aug 09 2006 - 05:45:14 CDT
On Wed, 9 Aug 2006, Andrew West wrote:
> I guess that most people know that I have always been a strong
> advocate of only encoding characters for which there is tangible
> evidence for their existence and need to be encoded.
The key question here is what existence means: does it mean existence as a
widely recognized symbol with some established shape, or does it _also_
require existing usage in texts? (For some definition of "text", of
course. But surely symbols that are _only_ used as standalone graphic
symbols lack use in texts.)
> However, the
> nature of symbols means that they may not commonly occur in plain text
> contexts, despite being widely occurring and very well known (e.g.
> Christian ichthys symbol or Chinese double happiness symbol).
Then they should not be encoded as _characters_.
> I believe that there is a utility in encoding widely occurring symbols,
> irrespective of their use in traditional text contexts, as many people
> would find it useful to be able to represent such symbols as
> characters rather than as images (primarily on web pages I imagine).
That's _not_ a good argument in favor of encoding them. If anything, it is
an argument against. On web pages, it is generally more successful to use
an image instead of a special character for any symbol that is not
commonly used in texts, even if it has been encoded in Unicode. The reason
is that with an image, you get the desired shape in most browsing
situations, irrespective of font issues, _and_ you can specify a textual
fallback for non-graphic browsers (and for graphic browsers used in
no-images mode). For example:
<img src="ichthys.gif" alt="Christian" title=
"The ichthys (fish) symbol, used to refer to Christ and Christianity">
(Your mileage may vary; you can use the alt="..." attribute to specify
whatever you deem appropriate to convey your message in cases where the
image cannot be displayed, which probably means that no images and no
graphic symbols at all can be displayed.) Using a hypothetical ichthys
character, you could enter it (simply as a character, e.g. in UTF-8, or as
a numeric character reference), but you would have no way to specify a
fallback for situations where it will not be rendered. This would be an
essential limitation, since even if we assume that the symbol were encoded
in Unicode today, it would take several years before it would be seen in
most browsing situations, and many more years before it would work
reasonably universally.
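
To make the contrast above concrete, here is a minimal Python sketch that
builds both kinds of markup. The function names are made up for
illustration, and U+E000 is only a Private Use Area placeholder, since no
ichthys character is assumed to exist in Unicode. The point is simply that
the image form has a slot for fallback text and the character form has
none.

```python
from html import escape

# Hypothetical stand-in: U+E000 is a Private Use Area code point, used here
# only because no encoded ichthys character is assumed to exist.
PLACEHOLDER_CODE_POINT = 0xE000


def image_markup(src: str, alt: str, title: str) -> str:
    """Build an <img> element; the alt text is the textual fallback."""
    return '<img src="{}" alt="{}" title="{}">'.format(
        escape(src, quote=True),
        escape(alt, quote=True),
        escape(title, quote=True),
    )


def character_reference(code_point: int) -> str:
    """Build a hexadecimal numeric character reference.

    There is nowhere to attach fallback text: if no available font has a
    glyph for the code point, the reader sees whatever the browser
    substitutes.
    """
    return "&#x{:X};".format(code_point)


print(image_markup("ichthys.gif", "Christian", "The ichthys (fish) symbol"))
print(character_reference(PLACEHOLDER_CODE_POINT))
```

The first line printed is an element that still conveys "Christian" when
images are off; the second is a bare reference whose rendering depends
entirely on the fonts at the reader's end.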
Web pages are (normally) not plain text but marked-up text, with tools for
including images. Hence, you can't really use them as arguments in favor
of encoding graphic symbols as characters.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/