From: John Hudson (tiro@tiro.com)
Date: Mon Aug 01 2005 - 20:35:40 CDT
Gregg Reynolds wrote:
> Maybe it's the size of the problem I'm not understanding. To take your
> example, let's suppose that RTL digits 0-9 are approved tomorrow.
> They're no different than their LTR equivalents, except for the
> typesetting semantics. That is, they share the same "underlying
> Platonic character", if I've understood you: they mean the number three.
> They just have different *typographic* semantics.
There is no concept of 'typographic semantics' in Unicode. (I'll leave it to the
philosophers to debate whether the Unicode notion of 'abstract character' is the same as
your 'underlying Platonic character'.)
You are proposing encoding of separate Unicode characters for RTL digits. Ergo, two
possible ways to encode each digit, and a major rewrite of existing software (including
updates to the cmap tables of all Arabic and Hebrew fonts) to ensure that these two sets
of characters are treated as if they were the same characters for numeric searching and
sorting. I don't see any way to do this that doesn't reimplementing a major aspect of RTL
text processing from scratch, with attendant expense and wastage of previous work. Maybe
it would have been a good idea about fifteen years ago, but now it is an economic
non-starter no matter what one thinks of the virtue of the idea itself.
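To give a rough idea of what "treated as if they were the same characters" costs, here is a small Python sketch. It does not use the proposed RTL digits, which do not exist; it uses a pair that is already encoded separately, the ASCII digits and the Arabic-Indic digits (U+0660..U+0669), as a stand-in. The function name fold_digits is made up for illustration. A plain comparison sees two unrelated characters, so every search and sort routine needs some folding step like this one, and the character data it relies on would have to be updated before any newly encoded digit set was recognised at all.

```python
import unicodedata


def fold_digits(text: str) -> str:
    """Replace each character that has a Unicode decimal digit value
    with the ASCII digit of the same value; leave everything else alone.

    fold_digits is a hypothetical helper, not part of any library.
    """
    out = []
    for ch in text:
        # unicodedata.decimal returns the decimal digit value of ch,
        # or the supplied default (None here) if ch is not a decimal digit.
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)


ascii_three = "3"
arabic_indic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE

# Without folding, the two "threes" are simply different characters.
print(ascii_three == arabic_indic_three)

# With folding, a numeric search can match across both digit sets.
print(fold_digits("page \u0663\u0664") == fold_digits("page 34"))
```

The first comparison is false and the second is true. The point is only that the equivalence has to be implemented somewhere, in every application that searches or sorts numbers, for each additional way of encoding the same digit.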
> It is
> very clear to me that the only reason anybody uses such software is
> because they have no other choice, not because they are satisfied with it.
So improve the software. Determine correct behaviour for specific characters and desired
input methods and demand that applications get it right. Ripping out the foundations
because you don't like the wallpaper doesn't make a lot of sense.
John Hudson
--
Tiro Typeworks www.tiro.com
Vancouver, BC tiro@tiro.com
Currently reading: Dining on stone, by Iain Sinclair