Re: [OT] o-circumflex

From: David Gallardo (dgallardo@mediaone.net)
Date: Fri Sep 07 2001 - 13:06:51 EDT


As a practical matter, you need to take the diacritics into account when
sorting, even in English, where they may or may not have linguistic
significance; otherwise the relative order of words that differ only in
their diacritics is arbitrary. In other words, résumé and resume should
sort next to each other, but always in the same order.
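
A minimal Python sketch of that idea, assuming a simple two-level sort key
(the name sort_key and the word list are only illustrative, and this is not
a full collation implementation): compare the base letters first, and use
the accented form only to break ties.

```python
import unicodedata

def sort_key(word):
    # Decompose so that each accent becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", word)
    # Primary level: the letters with all combining marks removed.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Secondary level: the decomposed string itself, used only as a
    # tie-breaker so that equal base words always come out in the same order.
    return (base.casefold(), decomposed)

words = ["résumé", "resume", "résume", "result"]
print(sorted(words, key=sort_key))
# ['result', 'resume', 'résume', 'résumé']
```

Whatever order the input arrives in, the three spellings of "resume" stay
together and come out in the same relative order.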

Someone in another message mentioned "ñ". This is a different case in
principle, because in Spanish it's not a letter modified by a
diacritic--it's an entirely different letter. (It used to be written as two
side-by-side "n"s and then they got stacked.) Again, as a practical matter,
in English it's most common to ignore that greater distinction (because we
have only 26 letters in our alphabet) and to treat it as a letter +
diacritic, for the same reasons as above.
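
The difference between the two treatments can be sketched in Python. This is
only an illustration under simplifying assumptions: english_key, spanish_key
and the 27-letter alphabet string are hypothetical names made up here, and
real systems would normally rely on locale-aware collation rather than
hand-written keys.

```python
import unicodedata

N_TILDE = "\u00f1"  # the letter n-tilde, precomposed
# Assumed Spanish letter order: n-tilde is its own letter, right after n.
SPANISH_ALPHABET = "abcdefghijklmn" + N_TILDE + "opqrstuvwxyz"

def strip_marks(text):
    # Remove combining marks after canonical decomposition.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def english_key(word):
    # English-style: n-tilde is just n plus a diacritic (tie-breaker only).
    return (strip_marks(word).casefold(), unicodedata.normalize("NFD", word))

def spanish_key(word):
    # Spanish-style: n-tilde keeps its own rank; other accents are ignored
    # at the primary level. Characters outside the alphabet get rank -1.
    composed = unicodedata.normalize("NFC", word.casefold())
    ranks = []
    for ch in composed:
        if ch != N_TILDE:
            ch = strip_marks(ch)
        ranks.append(SPANISH_ALPHABET.find(ch))
    return (ranks, composed)

words = ["nube", "ñu", "nota", "ñame"]
print(sorted(words, key=english_key))  # ['ñame', 'nota', 'ñu', 'nube']
print(sorted(words, key=spanish_key))  # ['nota', 'nube', 'ñame', 'ñu']
```

With the English-style key the "ñ" words are interleaved with the "n" words;
with the Spanish-style key they all follow them.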

----- Original Message -----
From: "Ayers, Mike" <Mike_Ayers@bmc.com>
To: "'David Starner'" <dstarner98@aasaa.ofe.org>; <unicode@unicode.org>
Sent: Thursday, September 06, 2001 5:12 PM
Subject: RE: [OT] o-circumflex

>
> > From: David Starner [mailto:dstarner98@aasaa.ofe.org]
> > Sent: Thursday, September 06, 2001 01:40 PM
>
> > On Thu, Sep 06, 2001 at 04:03:07PM +0200, Thierry Sourbier wrote:
> > > The only little thing to know about French and diacritical mark is
> > > that when doing a sort diacritical mark are evaluated from right to
> > > left. (e.g. "cote" < "côte" < "coté" vs the English order "cote" <
> > > "coté" < "côte" ).
> >
> > I'm not sure there is an established English sort order. It's not a
> > problem that comes up much in English.
>
> I believe that there is an established sort order in English, which
> is to sort without regard to diacritics, or else we'd never find the words!
> In English (American English more than British English), diacritics are
> considered optional, and it is common to see "naïve" written "naive", "San
> José" written "San Jose", etc. Especially amongst Americans, the two are
> considered equivalent, and I know of no word pair in all of English which is
> separated only by a diacritic.
>
>
> /|/|ike
>


