From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Mon Jul 07 2003 - 08:52:08 EDT
On 06/07/2003 17:22, John Hudson wrote:
>
> Thanks for the thoughtful analysis, Peter. Eli Evans and I have been
> documenting all of the unique mark sequences in the Michigan-Claremont
> text and WTS morphology database that are potentially incorrectly
> re-ordered in Unicode normalisation (I say potentially, because the
> fixed position combining classes may, by chance, not reorder some
> combinations of vowels). In addition to the <patah, hiriq> and
> <qamats, hiriq> double vowel sequences for Yerushala(y)im, the example
> you cite from Exodus 20:4 involves two vowels with an interposed
> cantillation mark -- <qamats, etnahta, patah> -- which needs to be
> renderable both with and without the cantillation. The WTS morphology
> database also includes a <tsadi, sheva, hiriq> sequence (in 2 Ch
> 13:14, last word) that is not attested in either BHS or BHL; Peter
> Constable enquired about this, since it seemed that it might be an
> error, but the WTS editors assured him that it was intentional. ...
Thank you, John. Last year I did a similar analysis of the WTS database
(as released in 1998), well actually just a simple grep for the sequence
vowel - zero or more cantillation placeholders (^) - vowel, and found
only the 637 examples I mentioned. I missed the 2 Chronicles example,
perhaps because I didn't search for sheva followed by a vowel (though I
did include the reverse) as :A, :E and :F are the legal WTS encodings of
the hatef vowels (F=qamets). I just did that grep, for ":\^*[IOU]", and
found only the 2 Ch 13:14 example. Well, I must say that that one looks
very like an error in the WTS database, *MAX:ACOC:IRYM should be
*MAX:ACOC:RIYM. As this is marked with * as a rendering of the Ketiv, it
is odd to give it vowels at all, and very odd to give a unique
combination of vowels. But there may be something strange in the actual
MS here that I don't know about.
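For anyone who wants to repeat the second search, here is a minimal sketch in Python rather than grep. The pattern is the one quoted above; the function name and the two-word sample list are only illustrative assumptions, since the real input would be the words of the WTS database file, whose layout is not described here.

```python
import re

# Pattern quoted in the message: sheva (:), then zero or more
# cantillation placeholders (^), then one of the vowel codes I, O or U.
SHEVA_THEN_VOWEL = re.compile(r":\^*[IOU]")

def find_sheva_vowel(words):
    """Return the words containing a sheva followed by I, O or U."""
    return [w for w in words if SHEVA_THEN_VOWEL.search(w)]

# Illustrative sample: the form found in the database and the
# form suggested above as the likely intended one.
sample = ["*MAX:ACOC:IRYM", "*MAX:ACOC:RIYM"]
hits = find_sheva_vowel(sample)
print(hits)
```

Only the first form matches, because :A is a hatef vowel encoding and is deliberately left out of the character class.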
>
>
> ... Given the small number of attested sequences that would be
> adversely affected by normalisation re-ordering, I'm beginning to
> favour the idea of encoding these sequences as individual characters.
> We'd probably only need three or four, plus a right meteg, to solve
> the problem, and rendering would work fine with existing font and
> layout engine technologies.
This sounds like a sensible alternative.
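To see which of the sequences under discussion are actually disturbed by normalisation, something like the following Python sketch can be used. It relies only on the standard unicodedata module. The base letter lamed is an arbitrary choice for illustration and is not meant to reproduce the actual words; the helper name is likewise hypothetical.

```python
import unicodedata

# Base letters (lamed is an arbitrary stand-in base, an assumption).
LAMED, TSADI = "\u05DC", "\u05E6"
# Points and accent mentioned in the thread.
SHEVA, HIRIQ, PATAH, QAMATS = "\u05B0", "\u05B4", "\u05B7", "\u05B8"
ETNAHTA = "\u0591"

def is_reordered(seq):
    """True if NFC normalisation changes the order of the sequence."""
    return unicodedata.normalize("NFC", seq) != seq

cases = {
    "patah, hiriq": LAMED + PATAH + HIRIQ,
    "qamats, hiriq": LAMED + QAMATS + HIRIQ,
    "qamats, etnahta, patah": LAMED + QAMATS + ETNAHTA + PATAH,
    "tsadi, sheva, hiriq": TSADI + SHEVA + HIRIQ,
}

for name, seq in cases.items():
    classes = [unicodedata.combining(ch) for ch in seq[1:]]
    print(name, classes, "reordered" if is_reordered(seq) else "unchanged")
```

The combining classes explain the result: marks are sorted by class, so hiriq (14) moves in front of patah (17) or qamats (18), and patah moves in front of qamats and etnahta (220), whereas sheva (10) followed by hiriq (14) is already in canonical order and survives by chance.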
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/