RE: Danish & Norwegian sorting

From: Karlsson Kent - keka (keka@im.se)
Date: Tue Sep 15 1998 - 08:05:53 EDT


Patrik and Kolbjörn have answered this in some detail, and I generally
agree with what they say. However, when looking at the various standards,
customs and proposals on this, there are always slight differences. Are V and
W to be sorted together or not, how should the LETTER AE (æ) be treated, and
so on. And how should letters that may occur, but are very uncommon, in
some language be handled? Take thorn (þ) in Swedish: is it a separate letter,
a variant of "t", or a variant of "th"?

In addition, for names in phonebooks and the like, one often "moves together"
spelling variations of the "same" name (like Karlsson/Carlsson/Carlzon and
many others).

So it is no surprise that you get conflicting demands...

A few comments on the details:

Borrowed words/names are usually sorted according to the local rules. Note
that in the Nordic countries Üü ("German y") is almost invariably sorted
among the Yy, NOT among the Uu.

For Swedish, aa, ae, oe are invariably regarded as NOT being in any way
equivalent to å, ä, ö in sorting. *Some* authors still *use* these digraphs,
but only in a 7-bit ASCII setting, NOT in normal writing, as the text becomes
very hard to read.

Danish/Norwegian usually considers aa a way of writing å (and sorts it
accordingly). The same does NOT apply to ae and oe.
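
To make the Danish/Norwegian treatment of aa concrete, here is a minimal
sketch of a sort key that ranks the digraph "aa" as "å". The function name
danish_key and the sample names are made up for illustration, and the key has
only one level, so "aa" and "å" come out fully equal (a real collation would
probably separate them at a lower level). Input is assumed to be precomposed
(NFC) text.

```python
# Alphabet order assumed for Danish/Norwegian: a..z followed by æ, ø, å.
DANISH_NORWEGIAN = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(DANISH_NORWEGIAN)}


def danish_key(word):
    """Single-level key in which the digraph 'aa' ranks as 'å'."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        if text.startswith("aa", i):
            # Treat the two letters as one collating element: å.
            key.append(RANK["å"])
            i += 2
        else:
            # Characters outside the alphabet sort after it, by code point.
            key.append(RANK.get(text[i], len(RANK) + ord(text[i])))
            i += 1
    return key


names = ["Aalborg", "Zobel", "Abel", "Århus"]
result = sorted(names, key=danish_key)
print(result)
```

With this key, "Aalborg" ends up after "Zobel", next to "Århus", instead of
at the very start of the list. Note that ae and oe get no such treatment.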

Mostly, ö is considered to be the same base letter as ø, and usually ä is
considered to be the same base letter as æ (but some say æ should be sorted
either as "ae" or as a variant of "a" for Swedish).

For Swedish (and sometimes for Danish too?) V and W are usually considered
to be the same base letter. W is used in only one Swedish word, webb..., but
it is common for personal names to use W instead of V; there is no
pronunciation difference between V and W in these languages.
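
One way to read "same base letter" is as a two-level key: V and W compare
equal on the first level, and the difference only breaks ties on a second
level. The sketch below is an illustration under that assumption; the name
swedish_vw_key is hypothetical, the choice of v before w on ties is mine, and
the first level is a plain code point comparison, so it is only meaningful
for names written in a-z.

```python
def swedish_vw_key(word):
    """Two-level key: w and v share a base letter, v wins ties."""
    text = word.lower()
    # Level 1: fold w into v so both letters interleave.
    primary = text.replace("w", "v")
    # Level 2: False (v, or any other letter) sorts before True (w).
    secondary = [ch == "w" for ch in text]
    return (primary, secondary)


names = ["Wallin", "Vik", "Vallin", "Wik", "Valter"]
naive = sorted(names)                      # all W names after all V names
result = sorted(names, key=swedish_vw_key)
print(naive)
print(result)
```

The naive sort keeps every W name after every V name, whereas the two-level
key puts "Wallin" directly after "Vallin" and "Wik" directly after "Vik".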

Note also that Sweden and Finland use the order ...zåäö, whereas Norway and
Denmark use ...zäöå (i.e., ...zæøå).
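
As a rough illustration of the two tail orders, the following sketch builds
a sort key from an explicit alphabet string. The helper make_key and the
sample words are invented for the example; only the order of the last three
letters comes from the paragraph above. It is a single-level key on
precomposed (NFC) lowercase text, nothing more.

```python
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"
DANISH_NORWEGIAN = "abcdefghijklmnopqrstuvwxyzæøå"


def make_key(alphabet):
    """Return a key function that ranks letters by position in alphabet."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        # Characters outside the alphabet sort after it, by code point.
        return [rank.get(ch, len(alphabet) + ord(ch)) for ch in word.lower()]

    return key


swedish_key = make_key(SWEDISH)
danish_key = make_key(DANISH_NORWEGIAN)

sv_sorted = sorted(["öl", "zon", "ål", "äng"], key=swedish_key)
da_sorted = sorted(["ål", "zone", "øl", "æble"], key=danish_key)
print(sv_sorted)
print(da_sorted)
```

The Swedish list comes out with å first among the extra letters, the
Danish/Norwegian list with å last.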

                /kent k

PS
The current relevant (semi-official) Swedish standard (written in English)
for this can be ordered from Statskontoret (http://www.statskontoret.se;
mailto:publikations.service@statskontoret.se). Ask for Teknisk Norm nr 34,
Swedish alphanumeric sorting. It is unfortunately not available in
electronic form. It uses 7 (seven!) levels. I would say that some of those
levels can be collapsed... (There is also an official Swedish standard on
this, which unfortunately is not quite as helpful: it does not use any
formalised sense of levels, it is only available in Swedish, and it is
more geared towards manual filing.)

There is also work under way on an ISO standard, 14651, for international
collation of strings covering all of (current) Unicode/10646. It only uses
four levels... The intent is that given this 'base collation'/'common
template' one can (shall) define 'deltas'/'tailorings' to suit (at least
close enough) the collation rules used in various countries/languages
world-wide.
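
The 'common template' plus 'delta' idea can be sketched in a few lines. This
is a toy model only: it is NOT the syntax or the table of ISO 14651, it has
one level instead of four, and the names BASE, tailor and the two delta lists
are made up for the example. Each rule simply says "place this letter
immediately after that one".

```python
# Toy stand-in for a common template: just a..z.
BASE = list("abcdefghijklmnopqrstuvwxyz")


def tailor(base, rules):
    """Apply (letter, anchor) rules: put letter right after anchor."""
    order = list(base)          # leave the shared template untouched
    for letter, anchor in rules:
        if letter in order:
            order.remove(letter)
        order.insert(order.index(anchor) + 1, letter)
    return order


# Hypothetical deltas reproducing the tail orders mentioned in this mail.
SWEDISH_DELTA = [("å", "z"), ("ä", "å"), ("ö", "ä")]
DANISH_NORWEGIAN_DELTA = [("æ", "z"), ("ø", "æ"), ("å", "ø")]

swedish = tailor(BASE, SWEDISH_DELTA)
danish = tailor(BASE, DANISH_NORWEGIAN_DELTA)
print("".join(swedish[-4:]))
print("".join(danish[-4:]))
```

The point of the design is that the template stays shared and each
country/language only states where it differs.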

> -----Original Message-----
> From: Kolbjørn Aambø@unicode.org
> Sent: den 15 september 1998 07:49
> To: Unicode List
> Subject: Re: Danish & Norwegian sorting
>
>
> >Is anyone familiar with the current practice in Denmark & Norway for
> >collating AE vs Æ (A-E-ligature), OE vs Ø (O-stroke) and AA vs. Å (A-ring)?
> >
> >I'm receiving conflicting requirements for these characters, as well as
> >confusing information about collating borrowed words containing A/O/U with
> >diaeresis (umlaut). My understanding is that the correct sort sequence is:
> > A, B, C, .... X, Y, Z, AE/Æ, OE/Ø, AA/Å
> >but that many commercial situations have decided to sort the letter pairs
> >as in English, and sort only the Danish characters after Z. I have no idea
> >where to place the German letters: should the U-umlaut be sorted after U or
> >with Y, the A-umlaut after A or with Æ?
> >
> >What does Dansk Standardiseringsråd say? I believe there's also a Dansk
> >Sproginstitut (or something like that): anyone know what their view is on
> >this?
> >
> >Brendan
>
> Norwegian ordering is as follows
>
> Aa, Bb, Cc....,Yy:Üü, Vv..., Zz,Ææ:Ää, Øø:Öö,Åå:<Aa><aa>.
>
> Æ has the name LATIN LETTER AE ...(Ash) in 10646 by the way....
> Notice that a double a (Aa or aa) is ordered as Å and å. Å replaced Aa in
> the 1917 Norwegian writing reform. The same happened in Danish in 1948.
>
> Æ can be displayed as AE, Ø can be displayed as OE and Å can be displayed
> as AA in 7-bit ASCII when there are no alternative ways of displaying them.
> AE and OE are not legitimate variants of Æ and Ø as such and should
> therefore be regarded as separate characters when written as separate
> characters.
>
> I have been representing Norway in the JTC1/SC2 in the ISO/IEC 10646 work.
> Kolbjørn Aambø,
> University of Oslo Library.
>
>


