Re: Codepoint Differentiation

From: UList@dfa-mail.com
Date: Tue Feb 22 2005 - 04:42:19 CST

    Hi Asmus,

    Thank you for your constructive criticism and debate.

    SERBIAN T

    > This should be handled by language dependent glyph selection.
    > That's a standard feature in OpenType and there's no need to
    > duplicate that facility in the encoding.

    As I've already mentioned, switching language identifiers back and forth every
    word in an HTML Russian-Serbian dictionary is hardly an efficient solution.
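
    To make the overhead concrete, here is a rough Python sketch of what per-word
    language tagging would look like for such a dictionary. The helper name
    `tag_entry` and the two word pairs are illustrative assumptions, not taken
    from any real dictionary markup:

```python
from html import escape

def tag_entry(russian, serbian):
    """Wrap each half of a dictionary entry in its own lang-tagged span."""
    return '<span lang="ru">{}</span> <span lang="sr">{}</span>'.format(
        escape(russian), escape(serbian)
    )

# Illustrative entries only; the spelling is identical in both languages,
# so the lang attribute is the sole thing selecting the italic glyph shape.
entries = [("брат", "брат"), ("три", "три")]

for ru, sr in entries:
    markup = tag_entry(ru, sr)
    text_len = len(ru) + len(sr) + 1   # the two words plus one space
    overhead = len(markup) - text_len  # characters spent purely on tagging
    print(markup, "| markup overhead:", overhead, "characters")
```

    Every entry pays the same fixed tagging cost, however short the words are.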

    And this is not a job for language identifiers anyway -- at least
    that's Unicode's opinion (??). That is to say, all the other language-specific
    Cyrillic letter variants have gotten codepoints. The problem with this one
    isn't that it's Serbian, but that it's italic.

    Not quite eligible for a codepoint, not worth switching entire language
    identifiers over one letter. Perfect candidate to be defined as an (optionally
    ignorable) *sub-category* of the main Cyrillic "t".

    And *such* a ridiculously simple thing to do, to say, OK, you can add
    Variation Selector 1 -- with no negative impact on anyone.
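
    A minimal Python sketch of the idea, assuming the sequence would be the
    ordinary Cyrillic small "t" (U+0442) followed by Variation Selector 1
    (U+FE00). That pairing is the proposal under discussion and is treated here
    as hypothetical, not as a variation sequence defined by the standard; the
    function names are illustrative too:

```python
import unicodedata

CYRILLIC_SMALL_TE = "\u0442"
VS1 = "\ufe00"

def is_variation_selector(ch):
    """True for VS1..VS16 and VS17..VS256 (Mongolian selectors not covered)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF

def low_fidelity(text):
    """Drop variation selectors, falling back to the primary codepoints."""
    return "".join(ch for ch in text if not is_variation_selector(ch))

# Hypothetical: the proposed "Serbian italic t" as base letter + VS1.
serbian_te = CYRILLIC_SMALL_TE + VS1
word = "бра" + serbian_te

# Show which characters make up the proposed sequence.
print([unicodedata.name(ch) for ch in serbian_te])

# A process that ignores the selector sees the plain Cyrillic word.
print(low_fidelity(word) == "брат")
```

    Software that does not know the sequence can ignore the selector and still
    show, search, or compare the base letter.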

    COPTIC -- MEA CULPA

    > Already being encoded in 4.1 in a new "Coptic" block. The unification of

    You're right. I had read the new documentation quickly, and thought it said
    the new Coptic block only adds additional characters, and the Greek letters
    are still shared. (It's the non-Greek Coptic letters from the Greek and Coptic
    block that will still be in use.)

    So Coptic can be scratched from my list. Sorry to distract everyone with that.

    ARCHAIC GREEK ALPHABETS

    > Rather than messing with variation selectors, this is best handled by using
    > fonts that are specific to archaic use.

    If we are telling users to download and install special fonts to view a Web
    page, we are back to the 1995 multilingual Internet. We might as well have
    Latin, Cyrillic, and Greek all assigned to the same codepoints, and switch
    fonts for those as well (as we used to do, before this thing called Unicode).

    Archaic Greek alphabets (and some similar alphabets from Asia Minor) have a
    greater identity than just "glyph variants". They were in use for hundreds of
    years by incredibly important civilizations.

    They are as visually distinct from standard Greek script as Latin is -- a
    point that can be easily proven by the fact that "Latin script" is actually
    just one of those Archaic Greek alphabets (West Greek), with a few minor alterations.

    And they have more material than Ogham or Runic -- possibly more material in
    the one long Gortyn Law Code inscription from Crete than the entire corpus of Ogham.

    So it can well be argued that they deserve codepoints. In fact I've been told
    offline that this may be in the pipeline, though I don't find any info on the
    Unicode site.

    But as I will establish in an upcoming separate analysis, there are actually a
    number of advantages to encoding these scripts "parallel" to the basic Greek
    codepoints -- especially if, as is to be expected, any Archaic Greek (or
    Archaic Aegean Alphabetic) block would be a single jumble of glyph variants,
    rather than a set of separate scripts.

    Thank you all for your continuing open minds regarding these matters. I'm
    going to keep working on explaining all this for a while, and you may just find
    yourselves *eventually* saying... "Oh, wait -- now I get it! It works! It's
    simpler! It's easier! It's better for us!" :)

    Doug

    Asmus Freytag wrote:
    >
    > At 04:44 PM 2/21/2005, UList@dfa-mail.com wrote:
    > >Hello,
    > >
    > >
    > >1. Then it sounds like:
    > >
    > > - Serbian Cyrillic Small "t"
    >
    > This should be handled by language dependent glyph selection.
    > That's a standard feature in OpenType and there's no need to
    > duplicate that facility in the encoding.
    >
    > (Unless I misunderstand this example).
    >
    > > - Coptic letterforms for Greek letter codepoints
    >
    > Already being encoded in 4.1 in a new "Coptic" block. The unification of
    > these has been considered a mistake - it took a while to rectify as we
    > needed to research what precisely the Coptic repertoire should be.
    >
    > >- complete Archaic Greek and Asia Minor scripts aligned to Greek letter
    > >codepoints
    >
    > Rather than messing with variation selectors, this is best handled by using
    > fonts that are specific to archaic use.
    >
    > Where it's a question of a different script - be patient, it's probably
    > slated to be encoded.
    >
    > It's a common problem that archaic scripts use different shapes at
    > different times for the same characters. Sometimes, the answer may be that
    > it's really two different scripts, in which case the precursor can be coded
    > separately. Sometimes, it's reasonable to ask users to use a different font
    > for a given period. Sometimes, a specific higher level protocol should be
    > developed to handle specific problems of scholarly representation of text.
    >
    > As a last resort, variation selectors might be used in some instances - but
    > not as a blanket approach.
    >
    > >are exactly what the Variation Selectors were designed for. There are no
    > >issues other than a smart font substituting an alternate glyph. They can
    > >default in a "low-fidelity" rendering to the primary codepoint glyphs. I would
    > >welcome individual codepoints for them, but Unicode has already decided
    > >otherwise. There is a clear need to be able to access the glyphs somehow, on a
    > >device which has only one general Unicode font installed.
    >
    > As stated above (and as others have pointed out) your premise is incorrect
    > for many of your examples. Not everything that requires glyph substitution
    > should be encoded via variation selectors.
    >
    > A./


