From: John Hudson (tiro@tiro.com)
Date: Mon May 24 2004 - 18:46:29 CDT
Peter Constable wrote:
> I was not involved in those discussions so cannot comment on them. I
> just wish to point out that the MCW representation of Hebrew most
> certainly *is* supported in Unicode: MCW uses ASCII Latin letters and
> punctuation characters to stand for Hebrew letters, vowel points and
> accents, and those exact same ASCII characters are encoded in Unicode.
That was an 8-bit hack. The point that Elaine and other Biblical Hebrew scholars make is
that MCW explicitly encodes distinctions between some marks, based on positioning, that
the Unicode Hebrew block unifies. This means that while MCW text can easily be converted
to Unicode Hebrew, it is not possible to round-trip such a conversion in the way that
Unicode provides for pre-existing 8-bit standard character sets. One unfortunate
consequence is that the ASCII-hack MCW encoding will likely remain the source encoding
for many electronic Biblical Hebrew texts for some time to come, even if published texts
are re-encoded as Unicode Hebrew, since MCW permits simple and unambiguous plain-text
encoding of distinctions that are important to textual analysis. For example, although my
clients at Libronic use Unicode encoding for their electronic BHS edition (because it
provides greater interchangeability), they maintain an MCW-encoded text as their master
source. So much for the 'universal' character set...
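
The round-trip problem can be illustrated with a minimal sketch. Everything in it is a
placeholder: the codes "A1", "A2", "B1" and the targets "X", "Y" are invented for
illustration and are not real MCW codes or Unicode characters. It only shows that once
two source codes map to one target, the conversion cannot be inverted from the output.

```python
from collections import defaultdict

# Hypothetical mapping table (assumption: these are NOT real MCW or Unicode values).
# "A1" and "A2" stand for the same mark in two different positions, which the
# source encoding keeps apart and the target encoding unifies.
TO_UNIFIED = {
    "A1": "X",
    "A2": "X",
    "B1": "Y",
}


def convert(codes):
    """Convert a sequence of source codes to the unified target codes."""
    return [TO_UNIFIED[c] for c in codes]


def preimages(table):
    """Group source codes by the target code they map to."""
    inverse = defaultdict(list)
    for src, dst in table.items():
        inverse[dst].append(src)
    return dict(inverse)


def round_trip_safe(table):
    """A mapping can be round-tripped only if every target has exactly one source."""
    return all(len(srcs) == 1 for srcs in preimages(table).values())


# Targets whose original source code cannot be recovered after conversion.
ambiguous = {dst: srcs for dst, srcs in preimages(TO_UNIFIED).items() if len(srcs) > 1}
```

Here convert(["A1", "B1"]) and convert(["A2", "B1"]) yield the same output, so the
distinction survives only if the original source text is kept alongside the converted one.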
John Hudson
--
Tiro Typeworks www.tiro.com
Vancouver, BC tiro@tiro.com

Currently reading:
Typespaces, by Peter Burnhill
White Mughals, by William Dalrymple
Hebrew manuscripts of the Middle Ages, by Colette Sirat