Comparing Raw Values of the Age Property
Richard Wordingham via Unicode
unicode at unicode.org
Mon May 22 17:48:19 CDT 2017
On Mon, 22 May 2017 15:10:02 -0700
Markus Scherer via Unicode <unicode at unicode.org> wrote:
> On Mon, May 22, 2017 at 2:44 PM, Richard Wordingham via Unicode <
> unicode at unicode.org> wrote:
> > Given two raw values of the Age property, defined in UCD file
> > DerivedAge.txt, how is a computer program supposed to compare them?
> > Apart from special handling for the value "Unassigned" and its short
> > alias "NA", one used to be able to compare short values against
> > short values and long values against long values by simple string
> > comparison. However, now we are coming to Version 10.0 of Unicode,
> > this no longer works - "1.1" < "10.0" < "2.0".
> This is normal for numbers, and for multi-field version numbers.
> If you want numeric sorting, then you need to either use a collator
> with that option, or parse the versions into tuples of integers and
> sort those.
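For concreteness, the tuple approach suggested above might look like the following in Python. This is only a sketch: the name parse_age is made up for illustration, and it assumes that short values consist of decimal integers separated by FULL STOP, which is exactly the assumption questioned below. It handles only the short values and the two spellings of the unassigned value.

```python
def parse_age(value):
    """Hypothetical helper: turn a short Age value such as "10.0" into (10, 0).

    Assumes fields are decimal integers separated by FULL STOP.
    Returns None for the unassigned value; callers must treat that case
    separately, because None cannot be ordered against a tuple.
    """
    if value in ("NA", "Unassigned"):
        return None
    return tuple(int(field) for field in value.split("."))


ages = ["2.0", "10.0", "1.1", "6.3"]

# Plain string comparison puts "10.0" before "2.0" ...
string_sorted = sorted(ages)

# ... whereas comparing tuples of integers gives the intended version order.
sorted_ages = sorted(ages, key=parse_age)
```

Tuples compare field by field, so (10, 0) sorts after (2, 0) even though "10.0" sorts before "2.0" as a string.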
Well, comparing "15.1" and "15.12" gives different answers depending on
whether you view them as decimal numbers or a hierarchical sequence of
integers.
> > Can one rely on the FULL STOP being the field
> > divider,
> I think so. Dots are extremely common for version numbers. I see no
> reason for Unicode to use something else.
But where is that stated?
> > and can one rely on there never being any grouping characters
> > in the short values?
> I don't know what "grouping characters" you have in mind.
Comma is the obvious one.
Looking to the far future (I trust you've heard of the predicted Cobol
crisis for the Y10k problem), will we have "1000.0" or "1,000.0"?
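Until the format is actually stated somewhere, one defensive option is for a parser to accept only what has been seen so far and to fail loudly on anything else, rather than guess. The sketch below is an illustration under that assumption, not a description of any documented rule; parse_age_strict and the pattern it uses are hypothetical.

```python
import re

# Assumed shape of a short Age value: two or more runs of ASCII digits
# separated by FULL STOP. Nothing in this thread confirms that as a
# guarantee; it is simply what values such as "1.1" and "10.0" look like.
_AGE_PATTERN = re.compile(r"[0-9]+(?:\.[0-9]+)+")


def parse_age_strict(value):
    """Hypothetical helper: parse a short Age value or raise ValueError."""
    if _AGE_PATTERN.fullmatch(value) is None:
        raise ValueError(f"unexpected Age value: {value!r}")
    return tuple(int(field) for field in value.split("."))
```

With this, "1000.0" parses to (1000, 0), while "1,000.0" raises ValueError instead of being silently misread. Checking against an explicit pattern also avoids relying on int() alone, which accepts some input that is not plain ASCII digits.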