Re: Processing Digit Variants

From: Steven R. Loomis <srl_at_icu-project.org>
Date: Wed, 20 Mar 2013 12:24:25 -0700

For general-purpose locale data such as LDML, and a general-purpose
library such as ICU, a number is something that a user simply types
on a keyboard, not necessarily any textual representation of a number.

On Tue, Mar 19, 2013 at 11:11 PM, David Starner <prosfilaes_at_gmail.com> wrote:

> On Tue, Mar 19, 2013 at 10:13 PM, Steven R. Loomis <srl_at_icu-project.org>
> wrote:
> > Richard,
> > For parse, it's pretty simple: U+0031 has a Unicode digit value. U+FE0E
> > does not. (Nor is it part of the defined numbering systems in LDML - see
> > http://unicode.org/reports/tr35/#Numbering_System_Data )
> > So, U+FE0E is the end of the sequence - not a number. End of parsing.
> >
> >>
> >> > > 10<ZWJ>0<ZWJ>0 would be perfectly reasonable for text
> >> > > likely to be rendered by a cursive Latin font
> >
> >
> > It's not reasonable for numeric parsing, however.
>
> Which is one of those things that frustrate people to no end.
> Invisible characters that mean that numbers aren't actually numbers
> will mean that somewhere, someone will beat their head against the
> desk and probably eventually work around a problem they will never
> understand.
>
> --
> Kie ekzistas vivo, ekzistas espero.
>
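
The parse rule quoted above (consume characters that have a decimal digit
value, stop at the first one that does not) can be sketched roughly as
follows. This is not ICU's implementation: it is a minimal illustration using
Python's standard unicodedata module, and the function name
parse_leading_digits is hypothetical. Unlike a real locale-aware parser, it
does not check that all digits come from a single numbering system, and it
ignores signs, grouping separators and decimal points.

```python
import unicodedata


def parse_leading_digits(text):
    """Parse the run of decimal digits at the start of `text`.

    Returns (value, consumed): the integer value of the digits read and the
    number of characters consumed. If the text does not start with a decimal
    digit, returns (None, 0).
    """
    value = 0
    consumed = 0
    for ch in text:
        # unicodedata.decimal returns the decimal digit value of a character,
        # or the supplied default (None here) when it has none.
        digit = unicodedata.decimal(ch, None)
        if digit is None:
            # First non-digit (e.g. U+FE0E or U+200D): end of the number.
            break
        value = value * 10 + digit
        consumed += 1
    if consumed == 0:
        return None, 0
    return value, consumed


if __name__ == "__main__":
    # U+0031 followed by U+FE0E (variation selector), then more digits.
    print(parse_leading_digits("1\uFE0E23"))
    # 10<ZWJ>0<ZWJ>0, as in the quoted example.
    print(parse_leading_digits("10\u200D0\u200D0"))
```

Under this sketch, "1" followed by U+FE0E yields the value 1 with one
character consumed, and 10<ZWJ>0<ZWJ>0 yields 10 rather than 1000, which is
the behaviour David Starner is objecting to.
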
Received on Wed Mar 20 2013 - 14:27:03 CDT
