Re: sequences and stuff

From: G. Adam Stanislav (adam@whizkidtech.net)
Date: Thu Nov 30 2000 - 19:51:32 EST


On Thu, Nov 30, 2000 at 04:55:15AM -0800, Michael Everson wrote:
>We're working on this; actually I am writing a paper which deals with some
>of the proposed solutions. That should be ready in a day or so. In the
>meantime, can you give me an example of a Czech or Slovak word in which
><ch> is a grapheme, and another in which <c><h> meet at a morpheme
>boundary? It would help me quite a lot.

Wow, someone else has introduced the topic I have raved about repeatedly. :)

Anyway, in Slovak, ch is always a single unit. But Braňo has a point: A
text may be multilingual, in which case some words may use 'ch' as a
grapheme which should be sorted after 'h', while others may use it as
two separate characters.
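
To make the difference concrete, here is a rough Python sketch of a sort
key that can treat 'ch' either as one unit placed after 'h' or as two
separate letters. The names (ALPHABET, RANK, collation_key, and the
ch_is_grapheme flag) are made up for illustration, and the alphabet is
simplified to plain a-z plus 'ch', so accented letters are not handled.
It is not how any particular collation library works.

```python
# Simplified alphabet (an assumption): plain a-z with the digraph "ch"
# placed right after "h". Accented Slovak letters are left out.
ALPHABET = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {unit: i for i, unit in enumerate(ALPHABET)}


def collation_key(word, ch_is_grapheme=True):
    """Return a list of ranks, optionally treating 'ch' as one unit.

    Raises KeyError for characters outside the simplified alphabet.
    """
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        if ch_is_grapheme and word.startswith("ch", i):
            # Consume both letters as a single collation unit.
            key.append(RANK["ch"])
            i += 2
        else:
            key.append(RANK[word[i]])
            i += 1
    return key


words = ["chata", "cena", "hora", "ihla", "dom"]

# 'ch' as a grapheme: "chata" lands after "hora".
slovak_order = sorted(words, key=collation_key)

# 'ch' as two letters: "chata" lands among the c-words.
naive_order = sorted(words, key=lambda w: collation_key(w, ch_is_grapheme=False))

print(slovak_order)  # ['cena', 'dom', 'hora', 'chata', 'ihla']
print(naive_order)   # ['cena', 'chata', 'dom', 'hora', 'ihla']
```

In a multilingual text the flag would have to be chosen per word (or per
language-tagged run), which is exactly the information plain text does
not carry.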

As for 'dz' and 'dž', they are not really a problem, simply because
each, taken as a single unit, sorts lexicographically exactly the same
as when taken as two separate characters.

Incidentally, when transliterating from Greek (chi), ch is really a
single unit in other languages as well.

Adam

-- 
Life is not just a matter of holding good cards,
but sometimes of playing a poor hand well.
		-- Robert Louis Stevenson


