From: karl williamson (public@khwilliamson.com)
Date: Mon Jul 26 2010 - 21:24:05 CDT
Asmus Freytag wrote:
> On 7/25/2010 6:05 PM, Martin J. Dürst wrote:
>>
>>
>> On 2010/07/26 4:37, Asmus Freytag wrote:
>>
>>> PPS: a very hypothetical tough case would be a script where letters
>>> serve both as letters and as decimal place-value digits, and with modern
>>> living practice.
>>
>> Well, there actually is such a script, namely Han. The digits (一、 
>> 二、三、四、五、六、七、八、九、〇) are used both as letters and as 
>> decimal place-value digits, and they are scattered widely, and of 
>> course there is a lot of modern living practice.
> Martin,
> 
> you found the hidden clue and solved it, first prize :)
> 
> They do not show up as gc=Nd, nor as numeric types Digit or Decimal.
> 
> The situation is worse than you indicate, because the same characters 
> are also used as elements in a system that doesn't use place-value, but 
> uses special characters to show powers of 10.
> 
> However, as I indicated in my original post, in situations like that, 
> there are usually some changes in practice that took place. Much of the 
> living modern practice in these countries involves ASCII digits. While 
> the ideographic numbers are definitely still used in certain contexts, 
> I've not seen them in input fields and would frankly doubt that they 
> exist there. I would fully expect that they are supported as number 
> format for output, at least in some implementations, and, of course, 
> that input methods convert ASCII digits into them. In other words, I 
> wonder whether automatic conversion goes only one-way for these numbers. 
> I would suspect it, for the general case, but I don't actually know for 
> sure.
> 
> For someone in Karl's situation, it would be interesting to learn 
> whether and to what extent he should bother supporting these numbers in 
> his language extension.
I don't think I would support these numbers, since we couldn't be 
sure what was intended.
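
To make the property point above concrete, here is a rough sketch using 
Python's standard unicodedata module. The helper name describe() is made 
up for illustration, and the results depend on the Unicode version your 
Python ships with. The idea is only to show that the ideographic digits 
carry neither Gc=Nd nor a Decimal/Digit numeric type, so a parser that 
keys off those properties would never pick them up:

```python
import unicodedata

def describe(ch):
    """Return (general category, decimal value, digit value, numeric value).

    Hypothetical helper: each value is None when the character lacks
    the corresponding property.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

if __name__ == "__main__":
    # ASCII seven, then two of the Han digits from Martin's list.
    for ch in ("7", "\u4e00", "\u3007"):
        print("U+%04X" % ord(ch), describe(ch))
```

The ASCII digit reports a decimal value; the Han characters do not, 
which is why they would simply fall outside a Gc=Nd-based rule.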

Another issue that I brought up a while back on this list is Tamil 
numbers, where western practice seems to have infiltrated enough that 
Unicode gave the digits Gc=Nd, but IIRC from the responses I got back 
then, they can also appear in an older style alongside other characters 
meaning 10, 100, and 1000.  In implementing this, encountering any of 
those other characters while parsing a number would disqualify it.
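
As a minimal sketch of that disqualification rule (in Python, not the 
actual implementation; the function name parse_place_value and the 
constant TAMIL_NON_PLACE_VALUE are invented for this example), something 
like the following would accept a run of Gc=Nd digits and reject it as 
soon as TAMIL NUMBER TEN, ONE HUNDRED, or ONE THOUSAND (U+0BF0..U+0BF2) 
shows up:

```python
import unicodedata

# Tamil signs for 10, 100 and 1000 (U+0BF0..U+0BF2). They are not Gc=Nd,
# so the decimal lookup below would reject them anyway; the explicit set
# just spells the rule out.
TAMIL_NON_PLACE_VALUE = {"\u0bf0", "\u0bf1", "\u0bf2"}

def parse_place_value(text):
    """Return the integer value of a place-value digit string, else None."""
    if not text:
        return None
    if any(ch in TAMIL_NON_PLACE_VALUE for ch in text):
        return None  # older-style notation: intent is ambiguous, so give up
    value = 0
    for ch in text:
        d = unicodedata.decimal(ch, None)
        if d is None:
            return None  # not a decimal digit at all
        value = value * 10 + d
    return value

if __name__ == "__main__":
    print(parse_place_value("\u0be7\u0be8\u0be9"))  # Tamil digits one, two, three
    print(parse_place_value("\u0be8\u0bf0"))        # Tamil two followed by TEN
```

This sketch does not check that all digits come from the same script; 
whether mixed-script runs should also be rejected is a separate question.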
> 
> A./
> 