From: busmanus (busmanus.lk@freemail.hu)
Date: Wed Aug 04 2004 - 14:52:07 CDT
Marcin 'Qrczak' Kowalczyk wrote:
> On Fri, 23-07-2004 at 18:01 +0200, Philipp Reichmuth
> wrote:
>
>
>>However, to return to the original problem, I don't remember ever having
>>seen data where it would be necessary to distinguish between trema and
>>diaeresis in the data itself.
>
>
> A similar issue: a Polish encyclopaedia I have from 1985 sorts words
> with Ó differently depending on whether this is Polish Ó (sorted between
> O and P, like other Polish letters are after letters without accents)
> or foreign Ó (folded with O, like other foreign accents are folded).
> It's typeset in the same way.
>
> MOQUETTE
> MÓR [mo:r], city in Hungary
> MORA
> MÓRA [mo:ro] Ferenc, Hungarian writer
> MORACZEWSKA
> [...]
> MOŻNOWŁADZTWO
> MÓR (a Polish word)
> [...]
> MÓŻDŻEK (a Polish word)
> MPHAHLELE
The context is somewhat different in these two cases, though. In the case
of Umlaut vs. Tréma, the distinction is between two different,
well-defined functions of the same diacritic that traditional German
scholarship is aware of (if for no other reason, then at least because of
the influence of the rather significant body of Classical Greek
scholarship that Germany produced), even if the use of one of them is
foreign to lexical items of native German vocabulary.
In the case of Ó in Polish, there is the native function (using Ó
to write a U that is etymologically connected with an O, if I'm not
mistaken) on one side, and there are all the non-native functions
(Hungarian Ó denoting a long O, Spanish Ó denoting an accented O; this
may be the case with Portuguese Ó as well, but I'm not sure; and then
there are the Icelandic and Irish Gaelic Ó, which may have a fourth
and a fifth function), all grouped together on the other side.
I'm not aware of native and non-native Ó being sorted any
differently in Hungarian encyclopedias, but it may happen in one or two
of the other languages I listed (or in yet others I'm not aware of). I'm
rather inclined to think that using e.g. a plain COMBINING ACUTE vs.
CGJ + COMBINING ACUTE is a dangerous way of approaching problems like
Polish vs. non-Polish Ó. The relevant point here seems to be the
language the word is in (I understand Unicode also has standard language
markers defined in its inventory).
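
To illustrate what "the language the word is in" could mean for sorting,
here is a rough Python sketch that keeps the characters identical and
carries the language as separate metadata. Everything in it is an
assumption made for illustration: the names POLISH_ALPHABET, fold_foreign
and sort_key are hypothetical, the language tags attached to the
headwords are my guesses ("und" standing for "undetermined"), and the
folding rule is only a guess at what the encyclopaedia's editors did. It
is not the Unicode Collation Algorithm.

```python
import unicodedata

# Polish alphabet order (plus Q, V, X), with Ó as its own letter between O and P.
POLISH_ALPHABET = unicodedata.normalize(
    "NFC", "AĄBCĆDEĘFGHIJKLŁMNŃOÓPQRSŚTUVWXYZŹŻ"
)
RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}


def fold_foreign(ch):
    """Drop combining marks, so that a foreign Ó is treated as a plain O."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def sort_key(entry):
    """Build a sort key for a (headword, language tag) pair.

    Assumption: accented letters count as separate letters only in words
    tagged "pl"; in every other word the accents are folded away.
    """
    word, lang = entry
    word = unicodedata.normalize("NFC", word.upper())
    primary = []
    for ch in word:
        letters = ch if lang == "pl" else fold_foreign(ch)
        # Characters outside the alphabet sort after all known letters.
        primary.extend(RANK.get(c, len(RANK)) for c in letters)
    # The raw spelling breaks ties, e.g. MORA before the Hungarian MÓRA.
    return (primary, word)


# Headwords from the quoted list; the language tags are illustrative guesses.
entries = [
    ("MPHAHLELE", "und"),
    ("MÓŻDŻEK", "pl"),
    ("MÓR", "pl"),
    ("MOŻNOWŁADZTWO", "pl"),
    ("MORACZEWSKA", "pl"),
    ("MÓRA", "hu"),
    ("MORA", "und"),
    ("MÓR", "hu"),
    ("MOQUETTE", "fr"),
]

ordered = sorted(entries, key=sort_key)
for word, lang in ordered:
    print(word, lang)
```

With these tags, the sketch reproduces the order quoted above: the
Hungarian MÓR sorts right after MOQUETTE, while the Polish MÓR sorts
after MOŻNOWŁADZTWO, although both are spelled with exactly the same
characters. The distinction lives entirely in the language tag, not in
the encoding of the Ó.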
Regards,
bushmanush