From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Sep 20 2004 - 13:21:19 CDT
At 06:09 PM 9/19/2004, D. Starner wrote:
>Asmus Freytag writes:
> > Given the nature of the symbol in question, I would personally see no
> > reason to object to encoding it - especially given the current and
> > projected lack of availability of other alternatives.
>
>It's a simple combining character. Even if you can't do arbitrary circles
>around characters, you can take one character sequence and map it to the
>glyph in a font. Systems that can't do even that need to be fixed.
In other words, you would like to treat this as a mandatory ligature.
To make this work in interchange, we need to get the buy-in from enough
platform, application and font vendors that they want to support this and
similar characters in that way (and fix their products where necessary).
If we can get that kind of buy-in, then we could add this and other special
purpose circled characters via the new "named sequences".
Lacking such buy-in, the addition of these as characters becomes more
appealing.
The problem here is that we have a proven track record that implementers
*have* supported additions to the character repertoire by expanding their
fonts. We do *not* have a proven track record of implementers widely
supporting special layout features, other than the core requirements for
a given script.
However, since we don't want to continue to encode accented characters
because of normalization, we are adding named sequences to the standard,
so that users can identify required sequences of characters and accents
by referring to sequence identifiers. Your suggestion logically implies
the extension of that process.
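
As a rough illustration of what "referring to sequence identifiers" could look like in practice, here is a minimal sketch of a lookup table built from a named-sequences data file. It assumes a simple "NAME;hex code points" line format with "#" comments, which may not match the draft file exactly. The sample entry is hypothetical and is not taken from the draft data.

```python
def parse_named_sequences(text):
    """Map each sequence name to the string of characters it identifies.

    Assumes one entry per line in the form 'NAME;XXXX YYYY ...' with
    '#' starting a comment. This layout is an assumption, not a spec.
    """
    sequences = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line or ";" not in line:
            continue  # skip blank or malformed lines
        name, _, codepoints = line.partition(";")
        chars = "".join(chr(int(cp, 16)) for cp in codepoints.split())
        sequences[name.strip()] = chars
    return sequences


# Hypothetical entry for illustration only: capital C followed by
# U+20DD COMBINING ENCLOSING CIRCLE. Not an actual named sequence.
SAMPLE = """\
# name;code point sequence
EXAMPLE CIRCLED CAPITAL LETTER C;0043 20DD
"""

table = parse_named_sequences(SAMPLE)
```

With such a table, an application or font tool could recognize the sequence by name and map it to a single glyph, which is the "mandatory ligature" treatment discussed above.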
In the case of symbols like copyright, we do have a precedent of encoding
these outright and *not* normalizing them. (C) and circled C are not
identical as Unicode stands today. This is similar to currency symbols.
However, unlike currency symbols, which are in extremely common use on
all sorts of embedded platforms, and where therefore single character
codes can be an advantage, most of the symbols like (Wz) etc. are much
less widely used and could indeed be handled as above (and recognized
as named sequences).
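
The point that the encoded symbols and the combining sequence are not identical can be checked against the normalization forms. The sketch below uses Python's standard unicodedata module. Taking U+00A9 COPYRIGHT SIGN, U+24B8 CIRCLED LATIN CAPITAL LETTER C, and "C" followed by U+20DD COMBINING ENCLOSING CIRCLE as the three candidates is my assumption about which characters are meant, and the results reflect the Unicode data shipped with the Python version in use.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

COPYRIGHT = "\u00A9"      # COPYRIGHT SIGN
CIRCLED_C = "\u24B8"      # CIRCLED LATIN CAPITAL LETTER C
C_PLUS_CIRCLE = "C\u20DD"  # C + COMBINING ENCLOSING CIRCLE


def ever_equal(a, b):
    """True if a and b become identical under any normalization form."""
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in FORMS
    )


# No normalization form unifies the encoded symbols with the sequence.
copyright_matches_sequence = ever_equal(COPYRIGHT, C_PLUS_CIRCLE)
circled_matches_sequence = ever_equal(CIRCLED_C, C_PLUS_CIRCLE)

# The compatibility forms fold the circled letter to a plain "C",
# dropping the circle rather than turning it into a combining mark.
circled_c_nfkc = unicodedata.normalize("NFKC", CIRCLED_C)
```

So a process that compares normalized text will treat the precomposed symbols and the combining sequence as distinct strings, which is why the choice between encoding a character and defining a sequence matters for interchange.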
From the perspective of the users, this solution would be more appealing
if we had the buy-in from major vendors to actively support this approach.
A./
PS for named sequences:
See: http://www.unicode.org/reports/tr34
Draft Data:
http://www.unicode.org/Public/4.1-Update/NamedCompositeEntities-4.1.0d4.txt
(the last part of the file name may change to NamedSequences*.txt).
A./