Spacing diacritics in Greek Extended

From: Nick NICHOLAS (nicholas@uci.edu)
Date: Wed Feb 28 2001 - 18:57:10 EST


As you know, in the short term any texts out there in Unicode
polytonic Greek use precomposed characters, as people are not waiting for
the intelligent font engines of the future. To put texts in Unicode, they
convert them from existing encodings. In all of these existing encodings,
be they 8-bit or ASCII-based (Beta Code), a capital letter with diacritics
(titlecase) is rendered as two glyphs: first the diacritics, as a spacing
glyph, and then the capital.

Since people have no familiarity with single-glyph
capitals-with-diacritics, they are doing the same in their Unicode texts:
instead of a precomposed capital, they use one of the spacing diacritics
at the bottom of Greek Extended, followed by a plain capital. See for example
http://www.fordham.edu/halsall/basis/thomais-uni.html : the diacritics in
section 5, at least, are separate glyphs.
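
To check a text for this pattern, something like the following Python
sketch could be used. The function name and the explicit list of code
points are my own choices for illustration, and it only looks at the
basic capitals U+0391..U+03A9, so treat it as a rough detector rather
than a complete one.

```python
import re

# Spacing diacritics in Greek Extended, listed explicitly.
SPACING_DIACRITICS = (
    "\u1fbd\u1fbf\u1fc0\u1fc1"  # koronis, psili, perispomeni, dialytika + perispomeni
    "\u1fcd\u1fce\u1fcf"        # psili + varia / oxia / perispomeni
    "\u1fdd\u1fde\u1fdf"        # dasia + varia / oxia / perispomeni
    "\u1fed\u1fee\u1fef"        # dialytika + varia, dialytika + oxia, varia
    "\u1ffd\u1ffe"              # oxia, dasia
)

# A spacing diacritic immediately followed by a basic Greek capital.
SPLIT_CAPITAL = re.compile("[" + SPACING_DIACRITICS + "][\u0391-\u03a9]")


def find_split_capitals(text):
    """Return (offset, two-character sequence) for each split capital found."""
    return [(m.start(), m.group()) for m in SPLIT_CAPITAL.finditer(text)]


if __name__ == "__main__":
    # Spacing PSILI + plain ALPHA, as a converted legacy text might have it.
    sample = "\u1fbf\u0391\u03bd\u03ae\u03c1"
    for offset, pair in find_split_capitals(sample):
        print(offset, [f"U+{ord(c):04X}" for c in pair])
```

A text that already uses the precomposed capital (U+1F08 for the sample
above) yields no matches.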

Unicode allows these spacing diacritic glyphs, but the Standard says that
"unless information is present to the contrary", they should be
interpreted as SPACE + non-spacing equivalent diacritic (Unicode 3.0,
pp. 169-170). Would it be expedient to change this so that the spacing
diacritic instead modifies the following character, as a legitimate
legacy concern (which is why the precomposed characters are there in the
first place)?
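
To make the two readings concrete, here is a small Python sketch using
the standard unicodedata module. The first part shows the SPACE +
non-spacing decomposition as recorded in the character database; the
second part is a hypothetical fold_spacing_diacritics() helper (my own
name, not anything in the Standard) that applies the alternative reading
by attaching the marks to a following capital. It assumes only the basic
capitals U+0391..U+03A9 matter, and it skips spacing diacritics whose
compatibility decomposition does not begin with SPACE.

```python
import unicodedata


def spacing_to_combining(ch):
    """Return the combining marks behind a Greek Extended spacing diacritic, else ''."""
    if not "\u1f00" <= ch <= "\u1fff":
        return ""
    decomposed = unicodedata.normalize("NFKD", ch)
    marks = decomposed[1:]
    # Only accept the shape SPACE followed by one or more combining marks.
    if decomposed[:1] == " " and marks and all(unicodedata.combining(m) for m in marks):
        return marks
    return ""


def fold_spacing_diacritics(text):
    """Attach a spacing diacritic to the capital that follows it (illustrative only)."""
    out = []
    i = 0
    while i < len(text):
        marks = spacing_to_combining(text[i])
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if marks and nxt and "\u0391" <= nxt <= "\u03a9":
            # Capital + combining marks, then compose to the precomposed form.
            out.append(unicodedata.normalize("NFC", nxt + marks))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # GREEK PSILI (U+1FBF): compatibility mapping to SPACE + COMBINING COMMA ABOVE.
    print(unicodedata.decomposition("\u1fbf"))
    # Spacing PSILI + ALPHA folded into a single precomposed character.
    folded = fold_spacing_diacritics("\u1fbf\u0391")
    print([f"U+{ord(c):04X}" for c in folded])
```

Under these assumptions, spacing PSILI followed by ALPHA comes out as
U+1F08, and a spacing diacritic not followed by a capital is left alone.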

Fortunately the main online resource for converting into Unicode
polytonic Greek (Sean Redmond's,
http://www.jiffycomp.com/smr/unicode/convert.php3) is well-behaved in
this regard.

-- 
Nick Nicholas. TLG, UCI, USA. nicholas@uci.edu; www.tlg.uci.edu/~opoudjis
 Many among their proselytes had sold their lands and houses to increase
  the public riches of the sect --- at the expense, indeed, of their
  unfortunate children, who found themselves beggars because their
  parents had been saints. (Edward Gibbon, _Decline and Fall_.)


