Transliterating ancient scripts [was: ASCII and Unicode lifespan]

From: Dean Snyder (dean.snyder@jhu.edu)
Date: Sat May 21 2005 - 13:51:12 CDT


    Nick Nicholas wrote at 2:00 AM on Sunday, May 22, 2005:

    >from Dean Snyder:
    >
    >> The truth is, most cuneiformists do not see any need for Unicode
    >> cuneiform; they are happy to continue in transliteration. But some
    >> very
    >> important cuneiformists DO see the need and when the rest see the
    >> programmatic tools we are developing for cuneiform in Unicode the
    >> prevailing laissez-faire attitude will change rather rapidly.
    >
    >Like I say on http://www.tlg.uci.edu/~opoudjis/unicode/unicode_epichorica.html :
    >"Don't Proliferate, Transliterate".

    Encoding existing scripts is not proliferation.

    Transliteration is lossy.

    And I notice you did not quote or respond to my concluding remark,
    pertinent to your comments: "Cuneiform in transliteration is like
    Japanese in transliteration, with all the same advantages and
    disadvantages."

    If you want to oppose encoding scripts you will have to deal with such
    observations.

    >(As
    >Patrick just said, and Carl-Martin Bunz insisted in Unicode tech note
    >3). Unicode may contain a whole heap of archaic scripts, but that
    >will not change the fact that old texts will overwhelmingly continue
    >to be published and discussed in transliteration

    In no small part due to inadequate technology.

    But contra, see Syriac, Greek, Hebrew, Old Chinese ...

    > --- both for
    >practical or political reasons
    >( http://www.tlg.uci.edu/~opoudjis/unicode/unicode_epichorica.html#target
    >has some ruminations on this, which I owe to recent discussion with
    >John Cowan),

    You lost your case when you completely misrepresented the arguments used
    against encoding Phoenician:

    "In the same way, Semiticists treated Phoenician and its ilk as part of
    the Hebrew patrimony, and so transliterated it into Hebrew (as a furious
    thread on Unicode List through much of 2004 brought forth --- which is
    why Semiticists do not see any point in encoding Phoenician separately
    from Hebrew)."

    That's worse than saying that Latin is transliterated into English.
    Phoenician is not transliterated into Hebrew; they are the same script.

    And Hebrew patrimony IS NOT the reason why some of us do not want
    Phoenician encoded; in fact, the many contra arguments do not even
    include this "reason".

    [And no, I refuse to be drawn into a new discussion of Phoenician
    encoding. I only mention this because it indicates faulty analysis of
    one transliteration issue, which I believe influences your remarks on
    cuneiform.]

    >and because of the
    >real problem with normalising an insufficiently known glyph
    >repertoire. Plus, of course, tradition matters a lot. Dean, you know
    >your community better than I do; but I think you're being optimistic.
    >(And I know that if I ever need to cite Sumerian in a paper, the
    >transliteration's what I'll use.)

    Anyone who does original research in cuneiform knows the pedagogical
    value of glyph reinforcement, the analytical value of glyph interaction,
    the value of programmatic script detection for text processing, and the
    importance of both glyph-based restorations and glyph-based error
    detection. Encoded scripts more closely model autograph text and
    therefore either enable or greatly improve the execution of these
    activities (without, of course, replacing the need for the autopsy of
    original texts).
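    [Ed.: a minimal sketch of what "programmatic script detection" by
    code-point range could look like. The block ranges are those eventually
    assigned to cuneiform in Unicode 5.0; the function and variable names
    here are illustrative, not from any project mentioned above.]

```python
# Crude script detection by Unicode block range: a transliterated text
# cannot be detected this way, but an encoded cuneiform text can.
CUNEIFORM_BLOCKS = [
    (0x12000, 0x123FF),  # Sumero-Akkadian Cuneiform
    (0x12400, 0x1247F),  # Cuneiform Numbers and Punctuation
]

def is_cuneiform(ch):
    """True if the character falls in a cuneiform block."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CUNEIFORM_BLOCKS)

def detect_scripts(text):
    """Per-character tally: cuneiform vs. everything else."""
    counts = {"cuneiform": 0, "other": 0}
    for ch in text:
        if ch.isspace():
            continue
        counts["cuneiform" if is_cuneiform(ch) else "other"] += 1
    return counts

sample = "\U00012000 an"  # a cuneiform sign followed by Latin transliteration
print(detect_scripts(sample))  # {'cuneiform': 1, 'other': 2}
```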

    Transliteration, however, as the term is used in cuneiform studies, will
    definitely continue to have its uses - primarily:

     * in bringing texts in lesser known scripts to non-specialist audiences
     * and in recording interpretations of text in polyvalent scripts.
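    [Ed.: the polyvalence point can be made concrete with a toy reading
    table. The `READINGS` mapping and function name below are hypothetical;
    the readings of the AN sign (U+1202D) follow standard sign lists.]

```python
# One cuneiform sign, several conventional readings: a transliteration
# records WHICH reading the editor chose -- an interpretation that the
# encoded sign itself does not carry.
AN = "\U0001202D"  # U+1202D CUNEIFORM SIGN AN

READINGS = {
    AN: ["an",      # 'heaven, sky'
         "dingir",  # 'god, deity'
         "d"],      # divine determinative (conventionally superscript)
}

def possible_readings(sign):
    """Return every conventional transliteration of a single sign."""
    return READINGS.get(sign, [])

print(possible_readings(AN))  # ['an', 'dingir', 'd']
```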

    >...
    >Finally, Michael Everson [...] dismisses the character-glyph model thus:
    >
    >> Because, Patrick, the character-glyph model is not as rigid and
    >> rule-bound as you would like it to be. Consider the many hundreds if
    >> not thousands of Han characters which are clearly duplicates,
    >> variants, or just plain unknown.
    >
    >But those Han duplicates etc. are not there because Unicode wished
    >them there; they are legacy cruft, saddled both by preexisting
    >encodings and by the cultural weight of CJK lexicography. Where
    >Unicode is considering encodings ab initio, with no such cultural or
    >legacy static, it should take its own rules seriously. The Phaistos
    >Disk (or board game) is not the Han character repertoire. After all,
    >just because we got saddled with oodles of precomposed codepoints
    >through legacy doesn't mean we should dismiss the avoidance of new
    >precomposed codepoints for being "rigid and rule-bound"; the case
    >looks to me fully analogous.

    Excellent argument.
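    [Ed.: for readers following the precomposed-codepoint analogy: the
    legacy precomposed characters are tolerable only because normalization
    maps them onto their combining-sequence equivalents, and normalization
    stability is precisely why new precomposed codepoints are avoided. A
    minimal Python illustration:]

```python
import unicodedata

precomposed = "\u00E9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE (legacy)
decomposed  = "e\u0301"  # e + U+0301 COMBINING ACUTE ACCENT

# The two spellings are interchangeable because normalization maps
# between them; a newly encoded precomposed character could not join
# this mapping without breaking normalization stability guarantees.
assert unicodedata.normalize("NFD", precomposed) == decomposed
assert unicodedata.normalize("NFC", decomposed) == precomposed
print("the two spellings normalize to each other")
```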

    Respectfully,

    Dean A. Snyder

    Assistant Research Scholar
    Manager, Digital Hammurabi Project
    Computer Science Department
    Whiting School of Engineering
    218C New Engineering Building
    3400 North Charles Street
    Johns Hopkins University
    Baltimore, Maryland, USA 21218

    office: 410 516-6850
    cell: 717 817-4897
    www.jhu.edu/digitalhammurabi/
    http://users.adelphia.net/~deansnyder/



    This archive was generated by hypermail 2.1.5 : Sat May 21 2005 - 19:59:18 CDT