Re: Aramaic unification and information retrieval

From: Christopher John Fynn (cfynn@gmx.net)
Date: Fri Dec 26 2003 - 20:28:54 EST

    "Michael Everson" <everson@evertype.com> wrote:
     "Unicode List" <unicode@unicode.org>

    > At 17:46 +0000 2003-12-26, Christopher John Fynn wrote:

    > >(Though the Roman style & Fraktur style of Latin script are probably more
    > >different from each other than some of the separately encoded Indic
    > >scripts [e.g. Kannada / Telugu])

    > Sorry, Chris, this is unsubstantiated speculation, and it doesn't
    > happen to be true.

    Well *that* part is true. Everyone I've ever talked to says either that Kannada
    script is based on Telugu script, that Telugu is based on Kannada, or that
    Telugu & Kannada originally shared a single "Telugu-Kannada" script which
    slowly diverged into two forms.
    cf http://brahmi.sourceforge.net/docs/KannadaComputing.html

    Whichever came first, Telugu & Kannada are certainly as close as Roman &
    Fraktur, and anyone who reads the Telugu alphabet can read the Kannada alphabet
    more easily than I can distinguish the different letters in Gutenberg's
    42-line Bible.

    I'm wondering, is there a principle that can be applied if someone, say, proposes
    the old Telugu-Kannada script? Do we say "use Telugu", "use Kannada" or accept
    it as a separate script? And what would be the basis of such a decision?

    > In 1997, I showed some comparisons between Coptic, Greek, Cyrillic,
    > and Gothic showing that all of them but Greek were similar enough to
    > be read with a minimum of training and practice. I revised this a bit
    > in 2001: http://www.evertype.com/standards/cy/coptic.html. German,
    > English, and Irish can all be read with a similarly low learning curve
    > whether the script is Fraktur or Gaelic; the number of letterforms
    > which differ is small. Wedding invitations in English-speaking
    > countries are routinely written in non-Latin garb. The identification
    > is uncontested!

    Some black letter Gothic styles of writing have numerous ligatures. The
    component letters of some of these ligatures are not that obvious if you don't
    read the language being written or have someone to tell you what they are. Of
    course if you read the language that is being written, you quickly work out
    most from context.

    Anyway, does recognition prove very much? In my experience many Indians
    with no knowledge of Tibetan can recognise Tibetan letters - even parts of
    conjuncts - although Tibetan is quite different from any modern Indian script
    and, since it is used to write a language which is not Indic, the conjunct
    set is quite different.

    > No student of writing systems classes the "Gaelic
    > script" as something different from "Latin script".

    No - but surely those who originally wrote Gaelic used the same style of script
    to write Latin?

    > The same cannot
    > be said of Phoenician, Samaritan, and Hebrew, for instance.

    > >So in the case of the ancient Semitic scripts - even if they are closely
    > >related, is each associated with a particular written language - or were
    > >the different but related scripts being used to write a common language?

    > All of them can be used to write more than one language. Some of them
    > may not have been. It's complex and needs review.

    It's obviously complicated. I'm just wondering if a loose rule of thumb can be
    worked out which can be applied to give an idea when a script should be encoded
    separately, when it should not, or when it is a case where there are
    reasonable arguments for and against separate encoding that need weighing up?

    Or is such a rule going to end up something like the description of a language
    as "a dialect that issues postage stamps" or "a dialect that has an army"?

    My own unscientific gut instinct is to be sympathetic to encoding "dead"
    ancient scripts separately even when they are related, since valuable historic
    information may be conveyed simply by the fact that a manuscript is written in
    one script or another. That information, which, granted, may be of more
    importance to palaeographers and epigraphers than to philologists, is no longer
    so apparent when that document is encoded in another script.

    When things are as historically remote as these scripts are, there may be a
    danger of too easily telescoping them together. If these Semitic scripts
    were in current usage would anyone seriously consider giving them a unified
    encoding?

    Perhaps now that a number of historic scripts with no immediate
    contemporary equivalents are candidates for encoding, it is as
    opportune a time as any to thrash out these things and establish
    some general principles?

    regards

    - Chris


