From: Dean Snyder (firstname.lastname@example.org)
Date: Fri Dec 26 2003 - 17:44:08 EST
Michael Everson wrote at 2:57 PM on Friday, December 26, 2003:
>At 02:23 -0500 2003-12-26, Dean Snyder wrote:
>>What does chronological priority have to
>>do with establishing separate encodings?
>The source of scripts and characters has often been a criterion for
>their disunification. Ages ago I showed that the unification of YOGH
>and EZH was incorrect because the two letters had different sources.
>The same is true for scripts.
>To sketch the relationships: Canaanite split into Phoenician and
>Aramaic. Paleo-Hebrew derives from Phoenician, as does Samaritan.
>Square Hebrew on the other hand derives from Aramaic. There are nodes
>on this tree which we are proposing to investigate for encoding.
Sounds very similar to the development of the Latin script variants,
>>Should Latin be separately encoded?
>Latin *has* been separately encoded.
Not the Latin that is comparable to the Phoenician we are talking about.
I am using Latin the way you are using Phoenician, concretely and
historically, not abstractly as it is used in Unicode where even the
modern IPA character inventions are dubbed "LATIN" this and "LATIN" that.
(My abstract name for the script of which Phoenician is a member is
"Ancient Northwest Semitic".)
Ancient Latin, as a parent script, is roughly analogous to the Phoenician
under discussion. Ancient Latin does not have a J, U, or W in it, and yet
Unicode, in the "Latin" block, has "LATIN CAPITAL LETTER J", etc. And so
ancient Latin is NOT separately encoded; just as English, German,
Croatian, Pinyin Latin, IPA, etc. are not separately encoded. The
Latinate script variants are all UNIFIED; and most often bear names with
the abstraction "LATIN" prepended, something you refuse to entertain for
the ancient Northwest Semitic script, of which Phoenician is but one
member. And this ancient script system has far more in common amongst its
variant glyphic realizations than those subsumed under the rubric "LATIN"
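
The naming point above can be checked directly. What follows is a minimal sketch, assuming Python's standard `unicodedata` module; the choice of sample characters (J, U, W, and two IPA letters) is an illustration only and not taken from the discussion.

```python
import unicodedata

# Sample characters: three letters absent from ancient Latin, plus two IPA letters
# (U+0251 and U+0283). The selection is illustrative, not exhaustive.
samples = ["J", "U", "W", "\u0251", "\u0283"]

# Map each character to its formal Unicode character name.
names = {ch: unicodedata.name(ch) for ch in samples}

for ch, name in names.items():
    print(f"U+{ord(ch):04X}  {name}")
```

Each name printed begins with "LATIN", whether the letter is ancient, medieval, or a modern phonetic invention.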
>>On the other hand, if you mean that both Hebrew and Phoenician are
>>not glyphic variants of the same script system, then I know of no
>>scholar who would agree with you.
>Every historian of writing describes the various scripts *as*
>scripts, and recognizes them differently.
These are typically either paleographers, who are more interested in
emphasizing glyphic variation than commonality, or they are script
taxonomists intent on delineating lines of derivation and innovation. In
neither case are they encoders, and in neither case do they use the word
"script" with that meaning invested in it by Unicodists.
Furthermore, I would venture to say that Unicode encoders met extensive,
entrenched opposition from Chinese, Japanese, and Korean scholars in the
effort to unify CJK, which makes it all the more striking that NOW it is
the Unicodists who are resisting the unifiers of the ancient Northwest
Semitic script while using similar cultural and historical rationalizations.
>We have bilinguals where
>people are distinguishing the scripts in text
Show me one that is not a font issue - much like switching in and out of
Fraktur type in modern German.
>we have discussion,
>for instance in the Babylonian Talmud, specifically discussing the
>different writing systems as different.
You need to cite these. I suspect these are not encoding level
discussions but rather historical/cultural/paleographical discussions.
>These scripts share a basic structure, sure.
That is quite an understatement if you glanced at the glyph chart I
attached to a previous email or if you support the unification of "Latin"
characters in Unicode.
>But Phoenician a glyph variant of Square Hebrew?
You are merely singling out end points from what I characterized as "a
continuum of glyphic variation within a single script system".
Again, if you separate out Phoenician, where will you stop? And on what
bases are you making these distinctions?
But, actually, I HAVE suggested that it might be a good idea to encode
the ancient Northwest Semitic script, which, though it includes Old
Hebrew, would not include Modern Hebrew.
>>Ancient Phoenician, Punic, Hebrew, Moabite, Ammonite, and Aramaic are
>>different dialects and/or languages commonly written with the same right-
>>to-left script system
>Again here you are using a "term", "script system" in an undefined way.
>>containing the same 22 non-numeric characters and exhibiting no more
>>glyphic variation over a period of a thousand years than that seen
>>in the various manifestations of the Latin alphabet.
>The same can be said for the Indic and Philippine and other scripts,
>yet we (properly) encoded them. Some of the nodes on the tree show
>enough variation to warrant separate encoding.
But not the Phoenician, Punic, Moabite, Ammonite, Old Hebrew, and Old
Aramaic nodes. In fact, the glyphic, or paleographic, variation is so
slight at times between texts in these languages and dialects, that it is
the extra-script evidence that is diagnostic for identification.
>Research as to which
>has not yet been completed apart from the initial work done in 1999
>resulting in the current Roadmap.
>>(For a sampling of ancient Phoenician, Moabite, and Hebrew glyphic
>>variation see the attached script chart taken from Gibson's Textbook
>>of Syrian Semitic Inscriptions - volume 2 has samples of Aramaic
>There are many such charts; the resolution of the one you sent is not
>sufficient to make use of it.
>>I see no justification for separately encoding Phoenician.
>Fine. I do, including but not limited to meta-discussion of writing
>systems in a very large body of secondary literature.
Can you point to ANY discussion in the secondary literature that concerns
itself with the ancient Northwest Semitic "writing system", in the
Unicode sense of that phrase, which is, of course, what we are talking
about here? What we have, I suggest, is a lot of paleographical and
taxonomic literature, but no (?) encoding-related literature.
>>If you did encode it, where, and on what bases, then would you draw
>>the lines for the separate encodings of the other ancient Northwest
>>Semitic languages and periods (because that's what these are, other
>>languages and periods, and not other scripts)?
>This is the specific work we have not done yet, but it's not rocket
>science.
You'll change your mind if and when you delve into it.
But the problem is, you've already made up your mind beforehand - "There
is zero chance that Phoenician will be considered to be a glyph variant
of Hebrew. Zero chance.".
>Students of writing are able to distinguish early Aramaic
>from Phoenician because of certain characteristics in the ductus for
Ductus is, of course, paleography, aka glyph variation.
>Also there was the introduction of the matres lectionis.
These are not new characters; these are the same old characters used in
new, polyvalent ways. That does not a new script or encoding make.
>>What we have here is a continuum of glyphic variation within a
>>single script system.
>Here we have a range of related but distinct scripts.
What exactly are the script distinctives you have in mind here?
>Compare Khutsuri (comprising Asomtavruli and Nuskhuri) and Mkhedruli
If they're analogous in development and manifestation to the ancient
Northwest Semitic script, on what bases were they encoded separately?
Cultural bases? Political bases?
>> >The number of books about writing systems, from children's books to books
>> >for adults, which contain references to the Phoenician alphabet as the
>> >parent to both Etruscan and Hebrew, are legion.
>>Using the same reasoning, we should separately encode Latin, the
>>parent script for English, German, French, Spanish, Italian, ...
>You appear to have reasoned about this matter in a different way than
>I have, for what you suggest would not follow from what I have
You: Lots of sources talk about Phoenician script being the parent of Hebrew.
Me: Lots of sources talk about Latin script being the parent of English.
You: Therefore we need separate encodings for Phoenician and Hebrew.
Me: Therefore we need separate encodings for Latin and English.
How am I reasoning "in a different way" than you are?
>> >Some scholars may decide to transliterate all Phoenician texts into
>>>Hebrew script and read only that, and retrieve it from their
>>>databases, and that is perfectly fine. Lots of people transliterate
>>>Sanskrit into Latin and never use Devanagari.
>>By definition, one cannot "transliterate ... Phoenician texts into
>Of course you can.
Transliteration means substituting one set of different but "analogous"
characters for another. If the characters are the SAME in both source and
destination what possible meaning does the word "substitute" convey?
Every character in Phoenician has its exact character equivalent in Old
Hebrew. And fonts will take care of the desired display issues
completely, just as they do for Fraktur, et al.
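
To make the "same characters" claim concrete, here is a minimal sketch, assuming Python's standard `unicodedata` module, that lists the letters of the existing Hebrew block and sets aside the final forms. Treating the final forms as positional variants rather than separate letters is an assumption made for this illustration.

```python
import unicodedata

# Code points U+05D0..U+05EA: the letters of the existing Hebrew block.
hebrew_letters = [chr(cp) for cp in range(0x05D0, 0x05EB)]

# Assumption for this sketch: final forms (FINAL KAF, FINAL MEM, ...) are
# positional variants, so they are filtered out by name.
base_consonants = [
    ch for ch in hebrew_letters if "FINAL" not in unicodedata.name(ch)
]

print(len(base_consonants))
for ch in base_consonants:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Under the unification view argued here, a text in any of the ancient Northwest Semitic languages would be stored with these code points, and the choice of font would determine whether the display looks Phoenician, Old Hebrew, or square.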
>>Unlike your example of Devanagari and Latin, Phoenician and Hebrew
>>share a common script system.
>You can transliterate Devanagari Sanskrit into Sinhala and Burmese,
>which scripts share the same structure. Latin shares a different
>structure, it is true.
>>I think the real problem here arises from the fact that medieval and
>>modern Hebrew, a superset of the ancient Hebrew script, with vowels,
>>punctuation, and cantillation marks added to late glyphic variants of the
>>22 ancient Northwest Semitic consonants, was encoded in Unicode without
>>considering Phoenician, Aramaic, etc. at the same time, and now there is
>>resistance to using Unicode characters with "Hebrew" in their names to
>>write Phoenician, Aramaic, etc.
>I think the "real problem" here arises from the fact that some
>scholars, familiar with Hebrew, find it easier to read early Semitic
>texts in square script than in the originals.
Well I, for one, prefer to read in more paleographically relevant
renderings; and fonts combined with markup will, of course, take care of
>The same thing happens
>with Runic and Gothic and Glagolitic and Khutsuri, and indeed
>Cuneiform, where Latin is often preferred (regardless of the
>structure of the writing systems).
"Transliteration" for cuneiform is entirely non-analogous because of this
script's massive polyvalency, something of a kind and scale not
encountered in ancient Northwest Semitic script. Cuneiformists use the word
"transliteration" not for context-free, descriptive character
substitution but for context-bound, interpretive syllabographic,
ideographic, or taxographic substitution; in short, "transliteration" for
cuneiformists represents how, out of a myriad of context-free
possibilities, the "transliterator" UNDERSTANDS the text in its context
should be read. This is NOT how you and I are using the term
>The needs of those scholars are
>met: they can use Hebrew and Latin with diacritics. No problem. The
>needs of other clients of the Universal Character Set, no matter how
>"unscholarly" they may be, will be met by encoding appropriate nodes
>in the Semitic tree.
So, you're NOT going to listen to the scholars in the field who want to
unify and will be the most serious and main users of the encoding, but
you WILL listen to the amateurs and hobbyists?! I'm not disparaging
script amateurs and hobbyists at all, in fact I wish there were more of
them (with money, to support script encoding initiatives :-)). But to
reject the advice of the very community that provides them with most of
their raw material seems like the tail wagging the dog, especially when
categorical decisions have already been made while you admit that
"Research ... has not yet been completed".
We do not separately encode all the paleographic and epigraphic variants
of the ancient Greek dialects; why should the almost completely analogous
situation in the ancient Northwest Semitic script be any different?
Like I said, I think the real reason for resistance is because we already
have an encoding called "Hebrew" and for political, historical, cultural,
religious, ethnic, or other reasons people do not want to use those
"Hebrew" characters for Phoenician, Aramaic, etc. It's an artifact of
what got encoded first, and the fact that it does not reflect the
historical situation. But I don't believe the solution is simply to start
encoding cultural and linguistic variants.
Dean A. Snyder
Scholarly Technology Specialist
Library Digital Programs, Sheridan Libraries
Garrett Room, MSE Library, 3400 N. Charles St.
Johns Hopkins University
Baltimore, Maryland, USA 21218
office: 410 516-6850 fax: 410-516-6229
Manager, Digital Hammurabi Project: www.jhu.edu/digitalhammurabi