Re: Fixing Two Unicode Asymmetries in case conversion

From: Mark Leisher (mleisher@crl.nmsu.edu)
Date: Fri Nov 13 1998 - 12:03:57 EST


> - there is a codepoint for LATIN SMALL LETTER SHARP S (\u00df) which
> does not have any uppercase correspondent defined, but according to
> German language rules, you have to convert it into "SS" when you go
> to upper case. Since there is no dedicated codepoint for the "double
> S", you have to (1) take care of this special case explicitly in
> your case conversion code; (2) you must grow the string previously
> containing the Sharp S character to accommodate the extra
> character.
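
A minimal Python sketch of the two points in the quoted paragraph, the explicit special case and the growing string. The function name `upper_with_sharp_s` is made up for illustration and is not taken from any library or from the original message.

```python
def upper_with_sharp_s(text: str) -> str:
    """Uppercase text, handling U+00DF explicitly (illustrative only)."""
    out = []
    for ch in text:
        if ch == "\u00df":
            # (1) the special case: sharp s has no single uppercase code point
            out.append("SS")
        else:
            out.append(ch.upper())
    # (2) the result may be longer than the input
    return "".join(out)


word = "stra\u00dfe"
upper = upper_with_sharp_s(word)
print(upper, len(word), len(upper))  # STRASSE 6 7
```

Building the result in a list sidesteps the in-place resizing that a fixed-size buffer would need.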

    Karl> As a native speaker of German, I strongly support the introduction
    Karl> of an uppercase equivalent to U+00df "ß" (lowercase German sharp s)
    Karl> not only for the cited reason. While the "official" German
    Karl> orthographic rules do not know an uppercase "ß", you find MANY
    Karl> instances of such a letter in practical use. Look at nameplates or
    Karl> newspaper advertisements: You will find a lot of "ß" surrounded by
    Karl> capitals, especially when the "correct" "SS" looks ugly, silly or
    Karl> (as is often the case) misleading. This is especially true in
    Karl> advertisements that use decorative fonts.

    Karl> Thus, I see a strong need for a code LATIN UPPERCASE LETTER
    Karl> SHARP S (preferably in the Latin Extended-B section). Font designers
    Karl> then may decide if they take the glyph of the lowercase "ß", a glyph
    Karl> showing two copies of the "S" glyph, or (specially for decorative
    Karl> fonts) a new design, like a more bold or angular variant of the
    Karl> lowercase "ß".

    Karl> This also touches on the question of whether an "official"
    Karl> orthography should be normative or descriptive. If the answer tends
    Karl> toward the latter, we have to look not only to the national standards
    Karl> bodies but also at real use among the people.

I imagine the decomposition of this "upper" case character would be
U+0053 U+0053. Not to mention the problem of "SS" not being a single letter
in German.

Although the existence of "ß" feels like an imbalance in the symmetry of the
bicameral Latin letters, I see this problem as being similar to the case of
Georgian. That script has no case, but it has "title" forms of the letters that
people sometimes mistake for upper case letters.

My answer would be to look at "real use" of "ß" as an example and change it from
being a lower case letter to simply a letter with no case (from Ll to Lo in
UCDB terms). Its appearance then simply becomes a glyph variation, and we can
do anything we want with glyph variants.
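
As a rough sketch of what that reclassification would mean for conversion code, the Python below overrides the general category of "ß" locally and skips case mapping for caseless letters. The names `CASELESS_OVERRIDES`, `category`, and `to_upper` are hypothetical; only `unicodedata.category` is a real standard-library call, and it reports Ll for U+00DF.

```python
import unicodedata

# Hypothetical local override: treat sharp s as a caseless letter (Lo).
CASELESS_OVERRIDES = {"\u00df"}


def category(ch: str) -> str:
    """General category, with the hypothetical Ll -> Lo override applied."""
    if ch in CASELESS_OVERRIDES:
        return "Lo"
    return unicodedata.category(ch)


def to_upper(text: str) -> str:
    """Uppercase text, leaving letters classified as Lo untouched."""
    return "".join(ch if category(ch) == "Lo" else ch.upper() for ch in text)


print(unicodedata.category("\u00df"))  # Ll
print(to_upper("stra\u00dfe"))         # STRAßE
```

Under this assumption the string keeps its length for "ß", and how the letter looks among capitals is left to the font.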
-----------------------------------------------------------------------------
Mark Leisher
Computing Research Lab            A truth is to be known always,
New Mexico State University       to be uttered sometimes.
Box 30001, Dept. 3CRL             -- Kahlil Gibran
Las Cruces, NM 88003


