From: Kenneth Whistler (kenw@sybase.com)
Date: Mon May 07 2007 - 17:06:50 CDT
> > Adam Twardoch wrote:
> > ... would make as little sense as encoding the
> >> uppercase "ß" as "S ZWJ S".
But of course stating it that way distorts the sense of the argument,
anyway. The counterproposal is to say that given existing
Unicode conventions, one could simply say that in those minority
contexts where one wishes to display an <S, S> sequence as
an uppercase [ß], use of a ZWJ to maintain a plain text distinction
and a ligature from a font for presentation could suffice.
That isn't *encoding* uppercase [ß] as "S ZWJ S"; it is
displaying <S, ZWJ, S> with a ligature uppercase [ß] glyph.
And John Hudson's argument about this is that using existing
mechanisms might work better as a practical matter, because
it has graceful fallback behavior.
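A minimal sketch of that idea, for illustration only: the helper names below are hypothetical and come from no proposal or library. It shows that <S, ZWJ, S> stays distinct from <S, S> in plain text, while a process that drops the joiner recovers ordinary "SS". The rendering side (a font ligating the sequence, or showing plain SS when it has no such ligature) is assumed and not modeled here.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER


def mark_capital_sharp_s(word: str) -> str:
    """Hypothetical helper: tag the first 'SS' in an all-caps word with a ZWJ."""
    return word.replace("SS", "S" + ZWJ + "S", 1)


def fallback(text: str) -> str:
    """What a process sees if it simply drops the joiner."""
    return text.replace(ZWJ, "")


marked = mark_capital_sharp_s("STRASSE")

# The plain-text distinction is preserved ...
is_distinct = marked != "STRASSE"

# ... and removing the joiner gives back the conventional spelling.
recovered = fallback(marked)
```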
But those advocating *for* uppercase [ß] don't seem to be
making practical arguments here, as best I can tell. The
argumentation is *essentialist* in nature: uppercase [ß] *is*
a letter, not a ligature, *therefore* it *must* be encoded
as a character.
I've been around the bend enough times to realize there isn't
much mileage to be gained in trying to argue down
essentialists, but I would like them to at least consider
the parallel with folks who have been arguing for years,
for example, that "ksa" in Devanagari *is* a letter, and therefore
must be encoded as a character.
> >> I strongly believe that "SS" is an anachronic, still-in-use but
> >> slowly-to-vanish poor man’s solution to write the uppercase "ß".
I'm perfectly willing to concede that writing systems change,
and the status of elements within them may change diachronically.
There are plenty of such examples in the Latin script, as we
all know. And it may well be that ß is in the middle of such
a transition. As Asmus noted, its "letterhood" is now officially
recognized in the German orthography, and as Adam and others
talking about the nature of Latin as a bicameral script have
been wont to point out, that means growing pressure for it
to acquire an uppercase form, whether we like it or not. Certainly
this echoes the process whereby many lowercase letters used in the
IPA have acquired uppercase forms by dint of usage in language
orthographies.
But Adam is talking as if the future course of history
here is predestined. There apparently is a camp of people
who think that not only is uppercase [ß] a letter and
deserving of encoding as a character, but it will inevitably
be reckoned as the rightful uppercase mapping of ß, with
further attendant changes to formal orthographic rules.
John Hudson responded:
> > I suspect, and indeed hope, that you are right. ...[but] having a
> > single lowercase character with two different uppercase mappings, one
> > currently standard and enshrined in existing casing rules and
> > implementations, one that might one day become standard and require
> > some kind of overriding implementation, seems to me a bit of a
> > standardisation and software development nightmare.
> >
And Asmus replied:
> The 'nightmare' is not with the characters, but with the potential that
> officially sanctioned rules might change.
... which Adam has as much as said is the future course of history.
But I don't think Asmus' pooh-poohing the concerns of John about
the character implementation issue does justice to the real
issues here.
The proposal formally suggests that uppercase [ß] get a lowercase
mapping to ß, but that, for stability, ß not get an uppercase
mapping to uppercase [ß]. That would be, to the best of my knowledge,
an unprecedented kind of case mapping in the UCD, and has its
own stability issue: there will be *years* of carping and rabble-rousing
that will follow on from that decision, as the camp which believes
that the natural, self-evident, and essential casemapping
relations should be:
ß <--> uppercase [ß]
ss <--> SS
will attempt to get the UnicodeData case mappings (and implementations
that follow from that) and case foldings "fixed" to reflect that
inevitable rightness.
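For readers who want to see the asymmetry concretely, here is a small sketch. It assumes a Python 3 runtime whose Unicode data includes U+1E9E LATIN CAPITAL LETTER SHARP S, the code point that was later assigned to this character; that code point is not part of the discussion above and is used here only for illustration. The mappings it exercises are the one-directional ones described in the proposal, not the symmetric pair some would prefer.

```python
SHARP_S = "\u00DF"          # LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1E9E"  # assumption: code point assigned after this discussion

# The capital form lowercases to the small sharp s ...
lower_of_capital = CAPITAL_SHARP_S.lower()

# ... but the small sharp s still uppercases to "SS", not to the capital form.
upper_of_small = SHARP_S.upper()

# So a lower-then-upper round trip does not return the capital letter.
round_trip = CAPITAL_SHARP_S.lower().upper()

# Case folding sends both forms, and "SS", to the same string.
folded = {SHARP_S.casefold(), CAPITAL_SHARP_S.casefold(), "SS".casefold()}
```

Any later change to the uppercase mapping of U+00DF would change `upper_of_small` and `round_trip` for every implementation that follows the published data, which is the stability concern raised above.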
But any changes in such a direction *are* the kind of software
development nightmare that John Hudson is warning about.
I won't bother trying to get them to pledge that they won't ask
for that, because they may well say so now (as the proposal does),
but then simply turn around and ask for the changes anyway.
Asmus went on to say:
> There's absolutely nothing
> that can prevent such a change, even if it were not to involve new
> characters. For example, assume that the solution of using 'SZ' in
> contrast to 'SS' became official. It would equally invalidate all
> software and throw confusion even into (fuzzy) search and sorting, with
> the potential of dragging lower case 'sz' into the fray.
No doubt that would be the case.
>
> That's why the proposers, correctly in my opinion, did not base their
> proposal on speculation on the direction of potential future reform, but
> limited themselves to documenting the existing usage, which clearly can
> be supported and deserves to be supported.
But I just don't buy that argument. The "existing usage" can
be supported with existing characters and with properly designed
fonts, actually. I think this comes back down to the essentialist
argument again. There is a group of German users and scholars
who believe that uppercase [ß] *is* a character, and it is
*that* which deserves to be supported, apparently.
I have yet to see cogent technical arguments for what real
issues are being addressed here, other than the need to *display*
uppercase [ß] glyphs on demand. The text processing arguments
have all been mumbo-jumbo and handwaving so far.
Furthermore, while the proposers may not have "base[d] their
proposal on speculation on the direction of potential future
reform", it is pretty clear from the discussion on this list
that the decision to encode an uppercase [ß] is smack in the
middle of such speculation, and encoding it will be used as
a lever to make further changes. Hence the (overly) passionate
opposition, as well as the (overly) passionate support for the
proposal, in my opinion.
> I remember writing before somewhere that I think their proposal should
> be accepted as presented.
Ah, but it has been a while since I've seen a single character
encoding proposal engender this much debate and controversy.
It may well be accepted as presented, but that is unlikely to
happen with any clear consensus.
--Ken