ME>Aye, there's the rub. It's easy for us to say, well, 7 is
really just a glottal stop. But we're talking natural
orthographies here. User sensitivities, legibility,
recognizability, etc. do have to be taken into account. (The
case is stronger for ? being a glyph variant of the glottal
stop.) At the end of the day, 7 is _their_ letter, not ours.
Definitely a valid concern.
>The linguist knows this; the schoolchildren do not, and if
they've been using 7 for years and years.... I mean, what if
they regularly write DIGIT 7 with a stroke through it and
LETTER 7 without? (I've no idea, it's just a thought.) The IPA
glottal stop isn't the most beautiful of creatures.
Indeed, it is not. And I was very surprised to learn that this
very glyph has been adopted for use together with Devanagari
script for some languages of Nepal. I would never have thought
to propose such a shape - it just doesn't seem to me to fit
with the design of that script. Perhaps, though, to the native
users it seems perfectly natural.
>Agreed that they shouldn't type DIGIT 7. We should think about
LETTERs 3, 4, and 7, though, and look at various fonts and
things to see if people encoded them with separate code
positions and/or unique shapes to differentiate the numbers
from the letters.
I concur about digit 7. I really doubt that we'd find people
encoding digit 7 separately from letter 7; in most cases, I
expect, they haven't been doing any information processing that
would care (that may be a naive assumption). Such things are
changing quickly in the world today, though, and I wouldn't
want to encourage them to maintain this kind of ambiguity.
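
To make the ambiguity concrete, here is a minimal sketch of what
character-property-aware software sees, assuming Python and its
standard unicodedata module. The word "a7a" is made up for
illustration, and U+0294 stands in for a dedicated letter only as
an example; whether that is the right character for these
orthographies is exactly what is being debated here.

```python
import unicodedata

DIGIT_SEVEN = "\u0037"   # what people type today
GLOTTAL_STOP = "\u0294"  # LATIN LETTER GLOTTAL STOP, used here only as an example letter


def describe(ch):
    """Return the name and general category of a single character."""
    category = unicodedata.category(ch)
    return {
        "name": unicodedata.name(ch),
        "category": category,
        "is_letter": category.startswith("L"),
    }


def letters_only(text):
    """Keep only characters whose general category is a letter (L*)."""
    return "".join(c for c in text if unicodedata.category(c).startswith("L"))


# Hypothetical word in which 7 is written as a letter.
word = "a7a"

print(describe(DIGIT_SEVEN))
print(describe(GLOTTAL_STOP))

# A letters-only filter silently drops the 7 when it is encoded as a digit...
print(letters_only(word))
# ...but keeps it when a letter character is used instead.
print(letters_only(word.replace(DIGIT_SEVEN, GLOTTAL_STOP)))
```

Note that the blind replace() above would also rewrite any genuine
digit 7 in the text, which is the practical cost of letting the
ambiguity stand in existing data.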
In the meantime, I suppose the Thais don't need to worry that
the Unicode digit police will come knocking on doors regarding
the Thai tone marks. (Those already have separate code points.)
:-)
That reminds me of another issue along these lines: What to do
for languages like Chinantec and Mixtec that write using Latin
script and indicate tones using superscript letters? (The tone
systems of these languages are far more complex than those of
African languages, so diacritics like acute and grave didn't
suffice. Yes, it's *far* from great, but apparently it was the
best option.) Can the superscript 1 - 5 characters
U+00B9
U+00B2
U+00B3
U+2074
U+2075
work for these? They are distinct from U+0030-0039, but the
corresponding digits in that range are their compatibility
decompositions, and the semantics aren't ideal (general
category: No; bidi category: EN).
Thoughts?
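
For anyone who wants to check those properties, here is a minimal
sketch, assuming Python and its standard unicodedata module. The
syllable "ma" plus superscript three is a made-up example, not
taken from any actual Chinantec or Mixtec text.

```python
import unicodedata

# SUPERSCRIPT ONE through SUPERSCRIPT FIVE
SUPERSCRIPTS = ["\u00b9", "\u00b2", "\u00b3", "\u2074", "\u2075"]


def properties(ch):
    """Return (code point, general category, bidi class, decomposition)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
        unicodedata.decomposition(ch),
    )


for ch in SUPERSCRIPTS:
    print(properties(ch))

# Hypothetical syllable carrying a superscript tone number.
syllable = "ma\u00b3"

# Canonical normalization leaves the tone mark alone...
print(unicodedata.normalize("NFC", syllable) == syllable)
# ...but compatibility normalization folds it into an ordinary digit.
print(unicodedata.normalize("NFKC", syllable))
```

The last line is the worrying one: any process that applies
compatibility normalization would turn the tone mark into a plain
digit and lose the distinction.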
Peter
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:51 EDT