Re: Hebrew: glyphs vs. codepoints

From: Arno Schmitt (arno@zedat.fu-berlin.de)
Date: Mon May 31 1999 - 01:47:38 EDT


Jonathan Rosenne:
> There aren't two Holams. These are glyphs, not characters.

The difference between "matsot" (loaves of unleavened bread)
and "mitswot" (obligations)
is not a (typo)graphical difference:
in "matsot" the holam stands to the right of the waw,
thus making the waw into a mater lectionis, a mere "carrier" of the
holam (waw plus right holam = [o]),
in "mitswot" the holam stands to the left of waw, here the waw is
a normal consonant (waw plus left holam = [wo]).
To say apodictically "There aren't two Holams." is not
enough!

With alef we have the same two possibilities:
in "rosh" and "bo" the holam sits on the right of the "carrier"
alef (right holam on alef = [o])
in "bo'i" the holam sits on the left of the bet because the alef
here is the consonant glottal stop giving [o'],
in "'oax" and "'otem" (with tet) we have the left holam on alef:
['o]
 
Jony wrote:
> it is not the job of international standards to improve local traditions.

I do not propose that you change the way Hebrew is written or
printed,
I just point out that on computers there are more intelligent
input methods than on the typewriter, and that handling of text is
easier when in "tsarix" and "tsrixim" the "x" has the same
codepoint. "for most of our purposes, this greatly simplifies
writing code to process the text." as Mark Leisher put it.
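
To illustrate the point about "tsarix" and "tsrixim", here is a minimal
sketch, assuming that text processing may fold the five final forms onto
their non-final counterparts before comparing strings. The names
FINAL_TO_BASE and fold_final_forms are hypothetical, made up for this
example; they are not part of any standard or library.

```python
# Hypothetical helper: fold the five Hebrew final forms onto their
# non-final counterparts so that one letter has one codepoint.
FINAL_TO_BASE = str.maketrans({
    "\u05DA": "\u05DB",  # FINAL KAF   -> KAF
    "\u05DD": "\u05DE",  # FINAL MEM   -> MEM
    "\u05DF": "\u05E0",  # FINAL NUN   -> NUN
    "\u05E3": "\u05E4",  # FINAL PE    -> PE
    "\u05E5": "\u05E6",  # FINAL TSADI -> TSADI
})


def fold_final_forms(text: str) -> str:
    """Return text with final forms replaced by their base letters."""
    return text.translate(FINAL_TO_BASE)


# "tsarix":  tsadi, resh, yod, FINAL kaf
tsarix = "\u05E6\u05E8\u05D9\u05DA"
# "tsrixim": tsadi, resh, yod, kaf, yod, FINAL mem
tsrixim = "\u05E6\u05E8\u05D9\u05DB\u05D9\u05DD"

# As encoded, the "x" differs (FINAL KAF vs. KAF), so a plain
# prefix test does not see the shared stem.
stem_found_raw = tsrixim.startswith(tsarix)

# After folding, the "x" has the same codepoint in both words.
stem_found_folded = fold_final_forms(tsrixim).startswith(
    fold_final_forms(tsarix)
)
```

The folding is for internal comparison only; choosing the final glyph
at the end of a word would then be left to display or output.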

May I remind you of what John Cowan quoted and wrote earlier:
# Variant forms of five Hebrew letters are encoded as separate
# characters in all Hebrew standards; therefore this practice
# is followed in the Unicode Standard. These five variant
# forms are encoded in this block rather than the compatibility
# zone in order to retain structural consistency between this
# block and ISO 8859-8.

JC: This tends to indicate, IMHO, that the editors of the Unicode
JC: Standard did not view the final forms as anything but compatibility
JC: characters, preserved only to make roundtripping with 8859-8
JC: and other 8-bit standards easy.


