Collation (was RE: [OT] o-circumflex)

From: Edward Cherlin (Edward.Cherlin.SY.67@aya.yale.edu)
Date: Thu Sep 13 2001 - 03:40:30 EDT


English and several other languages have dozens of collations. Compare telephone books, library catalogs, book indexes (sic), and other sorted data. Knuth, vol. 3, Sorting and Searching, gives an example of a set of library sorting rules that runs to more than a page, and suggests programming it as an exercise. ;-) One of the rules is to spell out numbers, so that a title beginning with digits is filed as if it were written in words.
For example,

1984 (Nineteen Eighty Four)
1066 and all that (Ten Sixty Six)
3001 (Three Thousand One)
2050 (Twenty Fifty)
2010 (Twenty Ten)
2001, A Space Odyssey (Two Thousand One)
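
A rough Python sketch of the spell-out rule, reading four-digit numbers the way the list above does. The function names, the "Oh" and "Hundred" readings for numbers like 1905 and 1900, and the choice to handle only a leading four-digit number are my own assumptions for illustration, not anything taken from Knuth's rules.

```python
import re

ONES = ["Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven",
        "Eight", "Nine", "Ten", "Eleven", "Twelve", "Thirteen", "Fourteen",
        "Fifteen", "Sixteen", "Seventeen", "Eighteen", "Nineteen"]
TENS = ["", "", "Twenty", "Thirty", "Forty", "Fifty", "Sixty", "Seventy",
        "Eighty", "Ninety"]


def two_digits(n):
    """Spell out 0..99, e.g. 84 -> 'Eighty Four'."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def spell_year(n):
    """Spell out a four-digit number (1000..9999) as in the list above."""
    high, low = divmod(n, 100)
    if high % 10 == 0 and low < 10:
        # 2001 -> 'Two Thousand One', 3000 -> 'Three Thousand'
        words = f"{ONES[high // 10]} Thousand"
        return words if low == 0 else f"{words} {ONES[low]}"
    if low == 0:
        return f"{two_digits(high)} Hundred"        # assumed reading of 1900
    if low < 10:
        return f"{two_digits(high)} Oh {ONES[low]}"  # assumed reading of 1905
    # 1984 -> 'Nineteen Eighty Four', 2050 -> 'Twenty Fifty'
    return f"{two_digits(high)} {two_digits(low)}"


def sort_key(title):
    """File a title under the spelled-out form of a leading four-digit number."""
    match = re.match(r"(\d{4})\b(.*)", title)
    if not match:
        return title.casefold()
    return (spell_year(int(match.group(1))) + match.group(2)).casefold()


titles = ["2001, A Space Odyssey", "1984", "2010",
          "1066 and all that", "3001", "2050"]
for title in sorted(titles, key=sort_key):
    print(title)
```

Sorting on this key puts the six titles in the order they are listed above, although a real filing rule would need many more cases than this.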

Bell Labs invented a whole programming language, Snobol, to deal with telephone listing conversions, matches, and sorts. Many phone books sort Mc- and Mac- together, others one after the other but separate from other names.
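
The two Mc-/Mac- conventions can be sketched as sort keys in Python. This is only an illustration under my own assumptions: the function names are made up, a prefix counts only when followed by a capital letter (so "Mack" and "Macy" are left alone, and "Macdonald" is missed), and in the separate-block version I have arbitrarily put the Mac- block, then the Mc- block, ahead of the other names under M. Actual phone books differ on all of these points.

```python
import re


def together_key(name):
    """Interfile Mc- with Mac-: 'McDonald' sorts as if spelled 'MacDonald'."""
    folded = re.sub(r"^Mc(?=[A-Z])", "Mac", name).casefold()
    # The original spelling breaks ties between McDonald and MacDonald.
    return (folded, name)


def separate_key(name):
    """Keep Mac- and Mc- names in their own consecutive blocks under M."""
    if re.match(r"Mac(?=[A-Z])", name):
        group = 0   # Mac- block first (assumed placement)
    elif re.match(r"Mc(?=[A-Z])", name):
        group = 1   # then the Mc- block
    else:
        group = 2   # then every other name with the same initial
    return (name[:1].casefold(), group, name.casefold())


names = ["Mason", "McDonald", "MacArthur", "Madison",
         "McAdam", "MacDonald", "Lee"]
print(sorted(names, key=together_key))
print(sorted(names, key=separate_key))
```

The same list comes out in two different orders, which is the point: both are "English" collations.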

Edward Cherlin
Generalist
"A knot! Oh, do let me help to undo it."
Alice in Wonderland

> -----Original Message-----
> Behalf Of Michael (michka) Kaplan
> Sent: Mon, September 10, 2001 8:36 AM
> From: "Mark Davis" <mark@macchiato.com>
>
> > Michael, that isn't the point. There is a problem even when you stick to one
> > language.

> By that time, many languages may have TWO collations, since users have been
> expecting something else for the last few decades?
>
> MichKa
>
> Michael Kaplan
> Trigeminal Software, Inc.
> http://www.trigeminal.com/



This archive was generated by hypermail 2.1.2 : Thu Sep 13 2001 - 03:27:54 EDT