Unicoders, some of you may already be familiar with Yannis Haralambous' paper
http://omega.enstb.org/yannis/pdf/amendments2.pdf , given at Dublin. 
The issues of precomposition have been hashed over again and again, 
so I won't add to them; but I'd like to register some concern (and 
some agreement) with other bits of it. In the following, of course, I 
speak for myself, dude with goatee in Australia --- not for any of 
my employers past or present, or for the Consortium.
The first two 'rules' Haralambous gives are peculiar, because they are 
already taken care of by normalisation. Furthermore, while users can 
and should be steered away from certain diacritic combinations, 
Unicode can't be the body saying what characters you can combine with 
what --- since you might not want to write coherent Greek at all (a 
point Rick has made to inquiries from the Greek Unicode list, and Ken 
(I think) to me on a previous occasion.) So it needs to be made clear 
that these are rules for users using the characters to produce cogent 
Greek --- rules which Unicode, as I understand, doesn't want to 
police at a design level.
To reiterate stuff that's been said before:
1.1. The acute and the tonos are the same thing; this is known. They 
weren't necessarily the same thing from 1982 to 1986, whence the 
confusion at ELOT. But normalisation takes care of this well. (The 
snarking at 2.1.4 is unnecessary: people with lots of knowledge of 
Greek were using a tonos distinct from the acute in 1982 --- 
including the notorious Prof. Kriaras --- and the confusion is 
likelier to be heritage from ELOT than Unicode's fault. Admittedly, 
the vertical dash that used to feature on the charts is not familiar 
to me as having been used in 1982...)
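For anyone who wants to see what "normalisation takes care of this" amounts to, here is a minimal sketch using Python's standard unicodedata module. The choice of alpha as the test letter and the helper name are mine, purely for illustration.

```python
import unicodedata

ALPHA_OXIA = "\u1F71"   # GREEK SMALL LETTER ALPHA WITH OXIA (polytonic block)
ALPHA_TONOS = "\u03AC"  # GREEK SMALL LETTER ALPHA WITH TONOS (monotonic block)

def same_after_normalisation(a: str, b: str, form: str = "NFC") -> bool:
    """Return True if the two strings are identical once normalised to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Show what each spelling decomposes to under NFD.
for ch in (ALPHA_OXIA, ALPHA_TONOS):
    decomposed = unicodedata.normalize("NFD", ch)
    print(unicodedata.name(ch), "->", [f"U+{ord(c):04X}" for c in decomposed])

print(same_after_normalisation(ALPHA_OXIA, ALPHA_TONOS))
```

Both spellings come out as alpha plus U+0301 COMBINING ACUTE ACCENT, so any process that normalises its input cannot tell the oxia from the tonos.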
1.2. An uppercase letter can carry accents without breathings in 
older typographical traditions of Greek; but of course, those are 
instances in the middle of a word, because all-caps words used to be 
accented --- and their accents were above the letter, not to the 
left. The accents to the left occur only in the initials of title 
case words, and of course always require a breathing in polytonic. 
What on earth the function of U+1FBA is (A with initial grave) is a 
mystery for the ages; I assume ELOT just got confused, giving 
pseudo-polytonic equivalents of capitals plus tonos. But of course, 
those characters will now never go away. Hopefully anyone designing a 
font to deal with 17th century Greek will realise that those 
characters shouldn't be used for all-caps accents --- and the current 
glyphs are certainly going to discourage anyone from making that 
mistake.
1.6. The shame with 1.6 (avoid spacing diacritics to emulate capital 
letters with left diacritics) is that this is how every 8-bit Greek 
font on earth has done those capital diacritics, so people just 
blindly convert them across to Unicode like that. Rule 6 should be 
shouted from the rooftops; and anyone working with Unicode Greek 
should realistically expect that they will get texts with this misuse 
of spacing diacritics (which includes just about every polytonic 
Greek text online --- TLG texts excepted :-) .)
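Since such texts will turn up, a rough Python sketch of a clean-up pass may be useful. It assumes the misuse takes the form "spacing diacritic immediately followed by a capital vowel or rho"; the mapping table and the function name are my own, not anything from the paper or the standard, and real legacy conversions may need more cases.

```python
import re
import unicodedata

# Spacing polytonic diacritics mapped to the combining marks they stand in
# for. Hand-written table; an assumption of this sketch.
SPACING_TO_COMBINING = {
    "\u1FBF": "\u0313",        # psili
    "\u1FFE": "\u0314",        # dasia
    "\u1FCD": "\u0313\u0300",  # psili and varia
    "\u1FCE": "\u0313\u0301",  # psili and oxia
    "\u1FCF": "\u0313\u0342",  # psili and perispomeni
    "\u1FDD": "\u0314\u0300",  # dasia and varia
    "\u1FDE": "\u0314\u0301",  # dasia and oxia
    "\u1FDF": "\u0314\u0342",  # dasia and perispomeni
}

# Capital vowels plus rho: the letters that can carry an initial breathing.
CAPITALS = "\u0391\u0395\u0397\u0399\u039F\u03A5\u03A9\u03A1"

_PATTERN = re.compile(
    "([" + "".join(SPACING_TO_COMBINING) + "])([" + CAPITALS + "])"
)

def repair_spacing_diacritics(text: str) -> str:
    """Rewrite 'spacing diacritic + capital' as one properly accented capital."""
    def fix(match: re.Match) -> str:
        marks = SPACING_TO_COMBINING[match.group(1)]
        # Put the marks after the letter, then let NFC pick the precomposed
        # form where one exists.
        return unicodedata.normalize("NFC", match.group(2) + marks)
    return _PATTERN.sub(fix, text)

if __name__ == "__main__":
    broken = "\u1FCE\u0391\u03C1\u03B7\u03C2"  # spacing psili-oxia, then Alpha
    fixed = repair_spacing_diacritics(broken)
    print([f"U+{ord(c):04X}" for c in fixed])
```

Where no precomposed capital exists for a combination, the result is simply left as letter plus combining marks.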
1.7. Using pre-combined characters rather than combining diacritics 
is something I've been guilty of myself in designing a website using 
Unicode Greek; but pre-combined chars are deprecated for good reason, 
and this suggestion should be downgraded even further: it is 
emphatically only an interim solution, until smart fonts (like Minion 
Pro) are in widespread use, and should be avoided in any text to be 
further processed electronically. Believe me, you don't want to write 
a search engine that deals with pre-combined characters while still 
allowing diacritic-insensitive searches...
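To make the point concrete, here is a hedged Python sketch of a diacritic-insensitive match. It leans on the standard unicodedata module to decompose everything first, which is precisely the step that is painful to write by hand for the pre-combined characters. The function names are hypothetical, and a real search engine would need far more than this.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Decompose to NFD, then drop every nonspacing mark (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")

def insensitive_find(haystack: str, needle: str) -> bool:
    """Case- and diacritic-insensitive substring test."""
    return strip_diacritics(needle).casefold() in strip_diacritics(haystack).casefold()

if __name__ == "__main__":
    # Polytonic text with pre-combined letters, searched with bare capitals.
    text = "\u1F41 \u1F04\u03BD\u03B8\u03C1\u03C9\u03C0\u03BF\u03C2"
    query = "\u0391\u039D\u0398\u03A1\u03A9\u03A0\u039F\u03A3"
    print(insensitive_find(text, query))
```

Once the text is decomposed the stripping is a one-liner; working directly on pre-combined characters means carrying a table of every accented letter instead.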
1.8. Unfortunately yes, some people do confuse psili and apostrophe; 
I've had to deal with this in legacy text myself.
1.9. Guillemets are standard typographical practice for quotations in 
Greece --- but not at all for Ancient Greek, where quotation marks 
tend to follow the practice of the publishing country. Though there is a 
special place in Hell for people using single quotes in Greek (as 
they are readily confused with psili and daseia), mandating 
guillemets for an audience including Western classicists is 
unwarranted.
Section 2 contains polemic against monotonic. Haralambous is entitled 
to his opinion; and you're entitled to mine, which is (a) good 
riddance (and the arguments made for the polytonic are anything but 
compelling); and (b) there is no way conceivable that polytonic Greek 
will make a comeback, when no one under 30 has learned it outside of 
Ancient Greek classes (and I say this as a 30 year old.) Polytonic 
may be holding on in *some* book publishing, but the majority of 
computer users will neither use it nor want to use it. To contend 
that monotonic is a local phenomenon, and that the needs of 10,000 
classical scholars in the West outweigh those of 10,000,000 Greek and 
Cypriot nationals, is Canute-like. Unicode should indeed recognise 
continuing contemporary use of polytonic; but that monotonic is 
official and prevalent, and the priority for any implementers, is 
beyond dispute.
2.2. On the mute iota: a reminder that the subscript/mute iota is 
indeed mute now, but was not mute in Classical Greek (it started 
dropping out in the 3rd century BC.) And in the inscriptions of the 
time, of course, it wasn't subscript at all: scribes only started 
indicating its muteness by subscripting with the invention of lower 
case, which all subsequent Greek typography has followed. Mute iota 
is fine as a Modern Greek name (though so is ypogegrameni!) --- but 
I'm not convinced classicists will like it. In any case, Unicode now 
conflates the subscript and the adscript in normalisation, so here 
too no dire results can come about. Furthermore, Haralambous is 
making the age-old glyph/character confusion: Unicode has an 
adscript glyph in its code chart for capital subscripts, but no one is 
forcing Haralambous to use that glyph if his typographical tradition 
wants capital subscripts instead. So to say that AiDHS (small cap 
subscript) and A|dhs (subscript instead of adscript) are "not 
conformant to Unicode v3.2" is misleading.
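As I read the character data, the conflation is visible in the decompositions themselves. A small Python sketch, again using only the standard unicodedata module; alpha is just my choice of example letter.

```python
import unicodedata

SMALL_ALPHA_YPOGEGRAMMENI = "\u1FB3"     # charts show a subscript iota
CAPITAL_ALPHA_PROSGEGRAMMENI = "\u1FBC"  # charts show an adscript iota

def decomposition(ch: str) -> list:
    """Return the NFD code points of `ch` as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]

for ch in (SMALL_ALPHA_YPOGEGRAMMENI, CAPITAL_ALPHA_PROSGEGRAMMENI):
    print(unicodedata.name(ch), "->", decomposition(ch))
```

Both decompose to the base letter followed by U+0345 COMBINING GREEK YPOGEGRAMMENI; whether that mark is drawn beneath or beside the capital is left to the font.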
2.3. It is true that the Greek circumflex looks like the combining 
inverted breve rather than the combining (Roman) circumflex; but 
surely the sensible solution, as already occurs with the treatment of 
precombined characters, is to treat the tilde and the inverted breve 
as glyph variants of the perispomeni, and to deprecate the characters 
for tildes, inverted breves, and Roman circumflexes as realisations 
of the perispomeni. (I've yet to see a dialectologist employ a Roman 
circumflex on a Greek vowel, but it's not beyond the bounds of 
reason. Plenty of use of inverted breve-perispomenis on consonants 
in dialectology, though, to indicate palatalisation.)
2.5. Haralambous is correct on the confusion: stigma is numeric, 
digamma alphabetic. (The stigma is originally the uncial version of 
the digamma --- but by the time people were using uncials, the 
digamma was only used as a number. So the bifurcation between stigma 
and digamma is exactly parallel to that between the Q-koppa and the 
S-koppa, the latter form also being mediaeval.)
3.1. I would want evidence of use of the uppercase Kai symbol. I've 
only ever seen it lowercase --- though I admit I've only ever seen it 
in old-style shopfronts. But I have only ever seen it lowercase even 
in all caps contexts. If the capital kai symbol represents 
typographical practice of long ago, I'm inclined to think this is 
better handled as a straight ligature for cased "Kai": old Greek 
typography didn't exactly skimp on ligatures, and Unicode doesn't 
need to know about them. Unless I see compelling evidence to the 
contrary, I don't think a capital Kai warrants inclusion in Unicode 
any more than the "esti" ligature. Symmetry with the lowercase kai 
ligature is not enough of a rationale for its inclusion.
3.1. On the other hand, capital Lunate sigma is long overdue, and 
I've never understood why it was omitted: the papyrologists that use 
it use case as much as anyone.
3.2. The reversed iota and upsilon are also still in use in Greek 
dialectology. They are of course the homebrew version of the Jod, 
though maintaining the distinctions between their etymological 
origins (upsilon, iota.)
It's an admirable quirk of Greek typography that people flipped iota 
circumflex and upsilon circumflex to get these jods; but I don't 
think Unicode should go down this path with discrete characters. 
You'll notice Haralambous' capital versions don't have tildes 
underneath, but breves. In fact, dialectologists in particular 
indicate jods for any combinations of characters pronounced as /i/ in 
the modern language; so you will see often enough eta with a 
combining breve underneath, or epsilon and iota with a combining tie 
(e.g. skol(ei)o = skoljo). Obviously eta breve should decompose the 
same way as iota breve, or the text becomes intractable; so I believe 
the correct solution for these is to encode these ersatz jods at the 
character level as letter + combining breve underneath (or tie), with 
the upside down tildes treated as glyph-variant ligatures (iota + 
breve underneath rendered through ligature as upside down 
iota-circumflex).
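A rough Python sketch of what that encoding might look like in practice. The specific code points are my assumption: U+032E COMBINING BREVE BELOW for the single-letter case and U+035C COMBINING DOUBLE BREVE BELOW for the tie. The helper names are hypothetical.

```python
BREVE_BELOW = "\u032E"         # assumed mark for a single-letter jod
DOUBLE_BREVE_BELOW = "\u035C"  # assumed "tie" joining a two-letter jod

def jod(letters: str) -> str:
    """Mark one letter, or a two-letter digraph, as pronounced /j/."""
    if len(letters) == 1:
        return letters + BREVE_BELOW
    if len(letters) == 2:
        # A double diacritic follows the first of the two letters it spans.
        return letters[0] + DOUBLE_BREVE_BELOW + letters[1]
    raise ValueError("expected one letter or a two-letter digraph")

def plain(text: str) -> str:
    """Drop the jod marks to recover the ordinary spelling."""
    return text.replace(BREVE_BELOW, "").replace(DOUBLE_BREVE_BELOW, "")

if __name__ == "__main__":
    word = "σκολ" + jod("ει") + "ο"
    print([f"U+{ord(c):04X}" for c in word])
    print(plain(word) == "σκολειο")
```

Because iota, eta, and the digraphs are all marked the same way, one rule recovers the plain spelling; a discrete flipped-iota character would need its own special case.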
3.3, 3.4. This proposal has been rejected too often for me to repeat 
why. :-) What software and OS designers choose to implement or not as 
available combinations need not be any concern of Unicode's. If you 
need circumflexes on epsilons, or smooth breathings on cap upsilons, 
talk to Adobe, not Unicode. The rest of the world does not want yet 
more precomposed forms to normalise; and I'm surprised at 
Haralambous' insistence on this old ground.
-- 
[][][][]                   [][][][][][][][][][]                [][][][]
Dr Nick Nicholas. opoudjis@optushome.com.au    http://www.opoudjis.net
                   University of Melbourne: nickn@unimelb.edu.au
     Chiastaxo dhe to giegnissa, i dhedhato potemu,
     ma ena chieri aftumeno ecratu, chisvissemu.    (I Thisia tu Avraam)