Re: Yerushala(y)im - or Biblical Hebrew

From: Peter Kirk (
Date: Mon Jul 07 2003 - 08:52:08 EDT

    On 06/07/2003 17:22, John Hudson wrote:

    > Thanks for the thoughtful analysis, Peter. Eli Evans and I have been
    > documenting all of the unique mark sequences in the Michigan-Claremont
    > text and WTS morphology database that are potentially incorrectly
    > re-ordered in Unicode normalisation (I say potentially, because the
    > fixed position combining classes may, by chance, not reorder some
    > combinations of vowels). In addition to the <patah, hiriq> and
    > <qamats, hiriq> double vowel sequences for Yerushala(y)im, the example
    > you cite from Exodus 20:4 involves two vowels with an interposed
    > cantillation mark -- <qamats, etnahta, patah> -- which needs to be
    > renderable both with and without the cantillation. The WTS morphology
    > database also includes a <tsadi, sheva, hiriq> sequence (in 2 Ch
    > 13:14, last word) that is not attested in either BHS or BHL; Peter
    > Constable enquired about this, since it seemed that it might be an
    > error, but the WTS editors assured him that it was intentional. ...
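
To make the re-ordering concern concrete, here is a minimal sketch using Python's standard `unicodedata` module. It is an illustration only, not the method John and Eli used: the base letter lamed and the helper names `combining_classes` and `is_reordered` are assumptions chosen for the example. It checks whether normalisation changes the order of the mark sequences named above.

```python
import unicodedata

# Hebrew marks named in the message, as Unicode code points.
HIRIQ = "\u05B4"
PATAH = "\u05B7"
QAMATS = "\u05B8"
ETNAHTA = "\u0591"
LAMED = "\u05DC"  # base letter, chosen only for illustration


def combining_classes(text):
    """Return the canonical combining class of each character."""
    return [unicodedata.combining(ch) for ch in text]


def is_reordered(text):
    """True if NFC normalisation changes the character sequence."""
    return unicodedata.normalize("NFC", text) != text


# <patah, hiriq>, <qamats, hiriq> and <qamats, etnahta, patah> on one base.
candidates = {
    "patah, hiriq": LAMED + PATAH + HIRIQ,
    "qamats, hiriq": LAMED + QAMATS + HIRIQ,
    "qamats, etnahta, patah": LAMED + QAMATS + ETNAHTA + PATAH,
}

for label, seq in candidates.items():
    print(label, combining_classes(seq), "reordered:", is_reordered(seq))
```

A sequence is only affected when a mark with a higher combining class precedes one with a lower class, which is why some vowel combinations escape re-ordering by chance.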

    Thank you, John. Last year I did a similar analysis of the WTS database
    (as released in 1998), well actually just a simple grep for the sequence
    vowel - zero or more cantillation placeholders (^) - vowel, and found
    only the 637 examples I mentioned. I missed the 2 Chronicles example,
    perhaps because I didn't search for sheva followed by a vowel (though I
    did include the reverse) as :A, :E and :F are the legal WTS encodings of
    the hatef vowels (F=qamets). I just did that grep, for ":\^*[IOU]", and
    found only the 2 Ch 13:14 example. Well, I must say that that one looks
    very like an error in the WTS database: *MAX:ACOC:IRYM should be
    *MAX:ACOC:RIYM. As this is marked with * as a rendering of the Ketiv, it
    is odd to give it vowels at all, and very odd to give a unique
    combination of vowels. But there may be something strange in the actual
    MS here that I don't know about.
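
The grep described above can be sketched in Python as follows. This is a rough equivalent under stated assumptions, not the original command line: the regular expression is the one quoted in the message, while the function name `find_sheva_vowel` and the two-word sample list are hypothetical stand-ins for a real pass over the WTS database.

```python
import re

# Pattern quoted in the message: sheva (:), zero or more cantillation
# placeholders (^), then one of the vowels I, O or U. The hatef vowels
# :A, :E and :F are legal WTS encodings and are deliberately not matched.
SHEVA_THEN_VOWEL = re.compile(r":\^*[IOU]")


def find_sheva_vowel(words):
    """Return the words that contain sheva followed by I, O or U."""
    return [w for w in words if SHEVA_THEN_VOWEL.search(w)]


# Hypothetical sample: the form found in the database and the form
# suggested in the message as the likely correct one.
sample = ["*MAX:ACOC:IRYM", "*MAX:ACOC:RIYM"]

for word in find_sheva_vowel(sample):
    print(word)
```

Only the first form is reported, because ":I" matches while ":A" and ":R" do not.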

    > ... Given the small number of attested sequences that would be
    > adversely affected by normalisation re-ordering, I'm beginning to
    > favour the idea of encoding these sequences as individual characters.
    > We'd probably only need three or four, plus a right meteg, to solve
    > the problem, and rendering would work fine with existing font and
    > layout engine technologies.

    This sounds like a sensible alternative.

    Peter Kirk

    This archive was generated by hypermail 2.1.5 : Mon Jul 07 2003 - 09:57:34 EDT