There is a similar problem with Swedish:
Our alphabet goes:
a
...
u
v & w (no difference made)
x
y
z
å
ä (the Danish/Norwegian "æ" is also sorted as "ä")
ö (the Danish/Norwegian "ø" is also sorted as "ö")
The German character "ü" is pronounced like a Swedish "y," so when a
German name or loan word containing that character occurs in Swedish it
should be sorted as "y." However, if a "ü" occurs in a Dutch loan word it
is considered a "u" with umlaut and is sorted as "u."
The same goes for "ä" and "ö": if they are the Swedish/Finnish/German
letters "ä" and "ö," they are sorted after "å"; if they are the Dutch letters
"a" with umlaut and "o" with umlaut, they are sorted as "a" and "o" in a
Swedish encyclopædia.
In Swedish the Danish/Norwegian letter "æ" is sorted as "ä," while the
Latin/Icelandic letter "æ" is sorted as "ae."
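
As a rough sketch of the rules above (not how any real collation library
works), the Python below builds a sort key from the alphabet listed earlier.
The name `swedish_key`, the origin tags ("de", "nl", and so on) and the
fallback of filing unknown characters last are all assumptions made for
illustration. The point is only that the key for "ü", "ä", "ö" and "æ"
cannot be computed from the string alone; the word's language of origin has
to be supplied as extra input.

```python
# Swedish alphabet order as described above; "w" is filed together with "v".
ALPHABET = "abcdefghijklmnopqrstuvxyzåäö"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

# Letters that are always mapped the same way.
FIXED = {"w": "v", "ø": "ö"}

# Letters whose treatment depends on where the word comes from.
# The origin tags are hypothetical labels for this sketch.
BY_ORIGIN = {
    "ü": {"de": "y", "nl": "u"},
    "ä": {"sv": "ä", "fi": "ä", "de": "ä", "nl": "a"},
    "ö": {"sv": "ö", "fi": "ö", "de": "ö", "nl": "o"},
    "æ": {"da": "ä", "no": "ä", "la": "ae", "is": "ae"},
}


def swedish_key(word, origin="sv"):
    """Return a tuple of letter ranks for `word`, given its language of origin."""
    key = []
    for ch in word.lower():
        if ch in BY_ORIGIN:
            # Fall back to the character itself if the origin is not listed.
            mapped = BY_ORIGIN[ch].get(origin, ch)
        else:
            mapped = FIXED.get(ch, ch)
        for m in mapped:  # "ae" expands to two letters
            # Characters outside the alphabet are filed last (an assumption).
            key.append(RANK.get(m, len(ALPHABET)))
    return tuple(key)


# Each entry carries its origin, which is exactly the extra information
# a plain string sort does not have.
entries = [("Müller", "de"), ("Muller", "sv"), ("Myller", "sv")]
ordered = sorted(entries, key=lambda e: swedish_key(e[0], e[1]))
```

Note that this single-level key makes "ü" tie with "y" (or "u"); a real
sort would need a further level to break such ties.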
Stefan
----- Original Message -----
From: "Mark Davis" <mark@macchiato.com>
To: "Michael (michka) Kaplan" <michka@trigeminal.com>; "Keld Jørn Simonsen"
<keld@dkuug.dk>; <unicode@unicode.org>
Sent: den 10 september 2001 17:27
Subject: Re: [OT] o-circumflex
> Michael, that isn't the point. There is a problem even when you stick to
> one language.
>
> That is, there are situations where two letters in a language, e.g. "ch"
> in Slovak, are normally sorted as one. However, in some exceptional
> circumstances those letters should be sorted separately. It could be
> because they come originally from another language, or it could be because
> they happen to arise when two other words are conjoined. There is no
> algorithmic distinction. So without some special character, it would
> require a dictionary look-up to produce the right sort.
>
> For example, suppose that "th" were sorted as a separate letter in
> English, after Z. Yet people would expect the following order:
>
> cast
> cathouse
> caul
> cathode
>
> because the "t" and "h" are logically separate in "cathouse".
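
The hypothetical "th" example can be sketched in Python as follows. This is
only an illustration under stated assumptions: the `BREAK` marker "|" stands
in for the "special character" mentioned above and is invented for this
sketch, as is the name `th_key`; the input is assumed to contain only the
letters a-z plus that marker.

```python
BREAK = "|"   # hypothetical marker meaning "do not contract here"
TH_RANK = 26  # one past "z": the imagined letter "th" sorts after Z


def th_key(word):
    """Sort key that treats "th" as one letter after Z unless BREAK separates it."""
    key = []
    w = word.lower()
    i = 0
    while i < len(w):
        if w[i] == BREAK:
            i += 1                      # the marker itself carries no weight
        elif w.startswith("th", i):
            key.append(TH_RANK)         # contraction: "th" counts as one letter
            i += 2
        else:
            key.append(ord(w[i]) - ord("a"))
            i += 1
    return tuple(key)


# With the marker, "t" and "h" in "cat|house" stay separate.
marked = sorted(["cast", "cat|house", "caul", "cathode"], key=th_key)

# Without it, the algorithm cannot tell the two cases apart.
unmarked = sorted(["cast", "cathouse", "caul", "cathode"], key=th_key)
```

Here `marked` comes out as cast, cat|house, caul, cathode, which is the
expected order, while `unmarked` puts "cathouse" last because nothing in the
plain string says the "th" is not a unit.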
>
> Mark
> —————
>
> Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Όμήρου Μαργίτῃ
> [http://www.macchiato.com]
> ----- Original Message -----
> From: "Michael (michka) Kaplan" <michka@trigeminal.com>
> To: "Keld Jørn Simonsen" <keld@dkuug.dk>; <unicode@unicode.org>
> Sent: Monday, September 10, 2001 5:48 AM
> Subject: Re: [OT] o-circumflex
>
>
> > From: "Keld Jørn Simonsen" <keld@dkuug.dk>
> >
> > > Real-life sorts, like MS Windows sorting or Linux sorting, actually
> > > adhere to these Danish rules, once you have set up your machine for
> > > Danish.
> >
> > And this is the *true* answer to the whole mess of attempting
> > *multilingual* sorts -- once the user chooses the sort they WANT, the
> > system might handle other language strings in a way that might be obscure
> > to those who know the other language but the person who expected Danish
> > or whatever will see what they want.
> >
> > Since various sorts openly conflict with each other there is no other
> > general case solution which would be appropriate, anyway?
> >
> > (can't believe this thread is still going on!)
> >
> >
> > MichKa
> >
> > Michael Kaplan
> > Trigeminal Software, Inc.
> > http://www.trigeminal.com/
> >
> >
> >
> >
>
This archive was generated by hypermail 2.1.2 : Mon Sep 10 2001 - 15:07:07 EDT