Re: Sorting Numbers

From: Jonathan Coxhead (jonathan@doves.demon.co.uk)
Date: Mon Mar 13 2000 - 14:03:01 EST


 | Correct ordering of at least integer numbers is a trivial
 | extension of the algorithm:
 |
 | - you treat leading zeros as second level entities
 | - you sort digits as first-level entities
 | - you prefix conceptually any substring of digits
 | with a first-level sorting marker that indicates the length of
 | the digit string.

   But there's a bug here (never a good sign in a "trivial" algorithm
:-): it needs an infinite supply of "1st-level sorting markers", i e,
it assumes that the problem is already solved.

 | Example: you transform
 |
 | 1
 | [...]
 |
 | into
 |
 | [1]1
 | [...]

   What do you transform 1000000000 into? Presumably,

      [10]1000000000

but that sorts incorrectly w r t [1]1, [2]10.

   I ask because I've considered the problem myself, but gave up at
this point because it seemed to hard. Any insight appreciated!

 | I can play the nasty user by
 | entering "one billion" and your sorting scheme breaks down
 | immediately.

   But this is sometimes entirely appropriate. If everything is
sorted as if it were back-transcribed from a speech stream, problems
with digits (and other "funny" characters too) go away. After all, it
would be odd if _Nineteen Eighty-Four_ was to be found in 2 different
places in the bookshop, simply because some publishers render it as
_1984_: file it under "N" in either case, say I!

        /|
 o o o (_|/
        /|
       (_/



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:59 EDT