Re: Ancient Northwest Semitic Script

From: Dean Snyder
Date: Fri Dec 26 2003 - 17:44:08 EST

    Michael Everson wrote at 2:57 PM on Friday, December 26, 2003:

    >At 02:23 -0500 2003-12-26, Dean Snyder wrote:
    >>What does chronological priority have to
    >>do with establishing separate encodings?
    >The source of scripts and characters has often been a criterion for
    >their disunification. Ages ago I showed that the unification of YOGH
    >and EZH was incorrect because the two letters had different sources.
    >The same is true for scripts.
    >To sketch the relationships: Canaanite split into Phoenician and
    >Aramaic. Paleo-Hebrew derives from Phoenician, as does Samaritan.
    >Square Hebrew on the other hand derives from Aramaic. There are nodes
    >on this tree which we are proposing to investigate for encoding.

    Sounds very similar to the development of the Latin script variants,
    doesn't it?

    >>Should Latin be separately encoded?
    >Latin *has* been separately encoded.

    Not the Latin that is comparable to the Phoenician we are talking about.

    I am using Latin the way you are using Phoenician, concretely and
    historically, not abstractly as it is used in Unicode where even the
    modern IPA character inventions are dubbed "LATIN" this and "LATIN" that.
    (My abstract name for the script of which Phoenician is a member is
    "Ancient Northwest Semitic".)

    Ancient Latin, as a parent script, is roughly analogous to the Phoenician
    under discussion. Ancient Latin does not have a J, U, or W in it, and yet
    Unicode, in the "Latin" block, has "LATIN CAPITAL LETTER J", etc. And so
    ancient Latin is NOT separately encoded; just as English, German,
    Croatian, Pinyin Latin, IPA, etc. are not separately encoded. The
    Latinate script variants are all UNIFIED; and most often bear names with
    the abstraction "LATIN" prepended, something you refuse to entertain for
    the ancient Northwest Semitic script, of which Phoenician is but one
    member. And this ancient script system has far more in common amongst its
    variant glyphic realizations than those subsumed under the rubric "LATIN"
    in Unicode.

    >>On the other hand, if you mean that both Hebrew and Phoenician are
    >>not glyphic variants of the same script system, then I know of no
    >>scholar who would agree with you.
    >Every historian of writing describes the various scripts *as*
    >scripts, and recognizes them differently.

    These are typically either paleographers, who are more interested in
    emphasizing glyphic variation than commonality, or they are script
    taxonomists intent on delineating lines of derivation and innovation. In
    neither case are they encoders, and in neither case do they use the word
    "script" with that meaning invested in it by Unicodists.

    Furthermore, I would venture to say that Unicode encoders met extensive,
    entrenched opposition by Chinese, Japanese, and Korean scholars in the
    effort to unify CJK, which makes it all the more striking that NOW it is
    the Unicodists who are resisting the unifiers of the ancient Northwest
    Semitic script while using similar cultural and historical rationalizations.

    >We have bilinguals where
    >people are distinguishing the scripts in text

    Show me one that is not a font issue - much like switching in and out of
    Fraktur type in modern German.

    >we have discussion,
    >for instance in the Babylonian Talmud, specifically discussing the
    >different writing systems as different.

    You need to cite these. I suspect these are not encoding level
    discussions but rather historical/cultural/paleographical discussions.

    >These scripts share a basic structure, sure.

    That is quite an understatement if you glanced at the glyph chart I
    attached to a previous email or if you support the unification of "Latin"
    characters in Unicode.

    >But Phoenician a glyph variant of Square Hebrew?
    >Certainly not.

    You are merely singling out end points from what I characterized as "a
    continuum of glyphic variation within a single script system".

    Again, if you separate out Phoenician, where will you stop? And on what
    bases are you making these distinctions?

    But, actually, I HAVE suggested that it might be a good idea to encode
    the ancient Northwest Semitic script, which, though it includes Old
    Hebrew, would not include Modern Hebrew.

    >>Ancient Phoenician, Punic, Hebrew, Moabite, Ammonite, and Aramaic are
    >>different dialects and/or languages commonly written with the same right-
    >>to-left script system
    >Again here you are using a "term", "script system" in an undefined way.
    >>containing the same 22 non-numeric characters and exhibiting no more
    >>glyphic variation over a period of a thousand years than that seen
    >>in the various manifestations of the Latin alphabet.
    >The same can be said for the Indic and Philippine and other scripts,
    >yet we (properly) encoded them. Some of the nodes on the tree show
    >enough variation to warrant separate encoding.

    But not the Phoenician, Punic, Moabite, Ammonite, Old Hebrew, and Old
    Aramaic nodes. In fact, the glyphic, or paleographic, variation is so
    slight at times between texts in these languages and dialects, that it is
    the extra-script evidence that is diagnostic for identification.

    >Research as to which
    >has not yet been completed apart from the initial work done in 1999
    >resulting in the current Roadmap.

    >>(For a sampling of ancient Phoenician, Moabite, and Hebrew glyphic
    >>variation see the attached script chart taken from Gibson's Textbook
    >>of Syrian Semitic Inscriptions - volume 2 has samples of Aramaic
    >>glyphic variants.)
    >There are many such charts; the resolution of the one you sent is not
    >sufficient to make use of it.
    >>I see no justification for separately encoding Phoenician.
    >Fine. I do, including but not limited to meta-discussion of writing
    >systems in a very large body of secondary literature.

    Can you point to ANY discussion in the secondary literature that concerns
    itself with the ancient Northwest Semitic "writing system", in the
    Unicode sense of that phrase, which is, of course, what we are talking
    about here? What we have, I suggest, is a lot of paleographical and
    taxonomic literature, but no (?) encoding related literature.

    >>If you did encode it, where, and on what bases, then would you draw
    >>the lines for the separate encodings of the other ancient Northwest
    >>Semitic languages and periods (because that's what these are, other
    >>languages and periods, and not other scripts)?
    >This is the specific work we have not done yet, but it's not rocket
    >science.

    You'll change your mind if and when you delve into it.

    But the problem is, you've already made up your mind beforehand - "There
    is zero chance that Phoenician will be considered to be a glyph variant
    of Hebrew. Zero chance.".

    >Students of writing are able to distinguish early Aramaic
    >from Phoenician because of certain characteristics in the ductus for

    Ductus is, of course, paleography, aka glyph variation.

    >Also there was the introduction of the matres lectionis.

    These are not new characters; these are the same old characters used in
    new, polyvalent ways. That does not a new script or encoding make.

    >>What we have here is a continuum of glyphic variation within a
    >>single script system.
    >Here we have a range of related but distinct scripts.

    What exactly are the script distinctives you have in mind here?

    >Compare Khutsuri (comprising Asomtavruli and Nuskhuri) and Mkhedruli

    If they're analogous in development and manifestation to the ancient
    Northwest Semitic script, on what bases were they encoded separately?
    Cultural bases? Political bases?

    >> >The number of books about writing systems, from children's books to books
    >> >for adults, which contain references to the Phoenician alphabet as the
    >> >parent to both Etruscan and Hebrew, are legion.
    >>Using the same reasoning, we should separately encode Latin, the
    >>parent script for English, German, French, Spanish, Italian, ...
    >You appear to have reasoned about this matter in a different way than
    >I have, for what you suggest would not follow from what I have

    You: Lots of sources talk about Phoenician script being the parent of
    Hebrew script.
    Me: Lots of sources talk about Latin script being the parent of English
    script.

    You: Therefore we need separate encodings for Phoenician and Hebrew.
    Me: Therefore we need separate encodings for Latin and English.

    How am I reasoning "in a different way" than you are?

    >> >Some scholars may decide to transliterate all Phoenician texts into
    >>>Hebrew script and read only that, and retrieve it from their
    >>>databases, and that is perfectly fine. Lots of people transliterate
    >>>Sanskrit into Latin and never use Devanagari.
    >>By definition, one cannot "transliterate ... Phoenician texts into
    >>Hebrew script".
    >Of course you can.

    Transliteration means substituting one set of different but "analogous"
    characters for another. If the characters are the SAME in both source and
    destination what possible meaning does the word "substitute" convey?
    Every character in Phoenician has its exact character equivalent in Old
    Hebrew. And fonts will take care of the desired display issues
    completely, just as they do for Fraktur, et al.

    >>Unlike your example of Devanagari and Latin, Phoenician and Hebrew
    >>share a common script system.
    >You can transliterate Devanagari Sanskrit into Sinhala and Burmese,
    >which scripts share the same structure. Latin shares a different
    >structure, it is true.
    >>I think the real problem here arises from the fact that medieval and
    >>modern Hebrew, a superset of the ancient Hebrew script, with vowels,
    >>punctuation, and cantillation marks added to late glyphic variants of the
    >>22 ancient Northwest Semitic consonants, was encoded in Unicode without
    >>considering Phoenician, Aramaic, etc. at the same time, and now there is
    >>resistance to using Unicode characters with "Hebrew" in their names to
    >>write Phoenician, Aramaic, etc.
    >I think the "real problem" here arises from the fact that some
    >scholars, familiar with Hebrew, find it easier to read early Semitic
    >texts in square script than in the originals.

    Well I, for one, prefer to read in more paleographically relevant
    renderings; and fonts combined with markup will, of course, take care of

    >The same thing happens
    >with Runic and Gothic and Glagolitic and Khutsuri, and indeed
    >Cuneiform, where Latin is often preferred (regardless of the
    >structure of the writing systems).

    "Transliteration" for cuneiform is entirely non-analogous because of this
    script's massive polyvalency, something of a kind and on a scale not
    encountered in ancient Northwest Semitic script. Cuneiformists use the word
    "transliteration" not for context-free, descriptive character
    substitution but for context-bound, interpretive syllabographic,
    ideographic, or taxographic substitution; in short, "transliteration" for
    cuneiformists represents how, out of a myriad of context-free
    possibilities, the "transliterator" UNDERSTANDS the text in its context
    should be read. This is NOT how you and I are using the term

    >The needs of those scholars are
    >met: they can use Hebrew and Latin with diacritics. No problem. The
    >needs of other clients of the Universal Character Set, no matter how
    >"unscholarly" they may be, will be met by encoding appropriate nodes
    >in the Semitic tree.

    So, you're NOT going to listen to the scholars in the field who want to
    unify and will be the most serious and main users of the encoding, but
    you WILL listen to the amateurs and hobbyists?! I'm not disparaging
    script amateurs and hobbyists at all, in fact I wish there were more of
    them (with money, to support script encoding initiatives :-)). But to
    reject the advice of the very community that provides them with most of
    their raw material seems like the tail wagging the dog, especially when
    categorical decisions have already been made while you admit that
    "Research ... has not yet been completed".

    We do not separately encode all the paleographic and epigraphic variants
    of the ancient Greek dialects; why should the almost completely analogous
    situation in the ancient Northwest Semitic script be any different?

    Like I said, I think the real reason for resistance is that we already
    have an encoding called "Hebrew" and for political, historical, cultural,
    religious, ethnic, or other reasons people do not want to use those
    "Hebrew" characters for Phoenician, Aramaic, etc. It's an artifact of
    what got encoded first, and the fact that it does not reflect the
    historical situation. But I don't believe the solution is simply to start
    encoding cultural and linguistic variants.


    Dean A. Snyder
    Scholarly Technology Specialist
    Library Digital Programs, Sheridan Libraries
    Garrett Room, MSE Library, 3400 N. Charles St.
    Johns Hopkins University
    Baltimore, Maryland, USA 21218

    office: 410 516-6850 fax: 410-516-6229
    Manager, Digital Hammurabi Project:
