From: Shriramana Sharma (samjnaa@gmail.com)
Date: Mon Nov 29 2010 - 13:51:31 CST
On Mon, Nov 29, 2010 at 11:45 PM, Mahesh T. Pai <paivakil@gmail.com> wrote:
> Would identifying situations where rendering systems should not do glyph
> substitution / reordering, etc when faced with multiple scripts help in
> any way here?
It really is not necessary here. This is not a question of rendering,
only of encoded character streams.
If somebody has first registered [0C85 0CB0 0C97].com (Kannada ಅರಗ)
then later [0C05 0C30 0C17].com (Telugu అరగ) should not be permitted
to be registered, and vice versa.
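
As a rough sketch of the registration rule above (not any registry's actual
policy), one could reduce each label to a "skeleton" through a confusables
table and refuse a label whose skeleton is already taken. The table below is
a hypothetical one covering only the three letter pairs in my example, and
the names `skeleton` and `Registry` are invented for illustration.

```python
# Hypothetical confusables table: ONLY the three pairs from the example.
# A real table would have to come from a reviewed data source.
CONFUSABLE_TO_PROTOTYPE = {
    "\u0C85": "\u0C05",  # KANNADA LETTER A  -> TELUGU LETTER A
    "\u0CB0": "\u0C30",  # KANNADA LETTER RA -> TELUGU LETTER RA
    "\u0C97": "\u0C17",  # KANNADA LETTER GA -> TELUGU LETTER GA
}


def skeleton(label: str) -> str:
    """Map every code point to its prototype; unmapped ones pass through."""
    return "".join(CONFUSABLE_TO_PROTOTYPE.get(ch, ch) for ch in label)


class Registry:
    """Toy registry that rejects labels colliding on their skeleton."""

    def __init__(self) -> None:
        self._by_skeleton: dict[str, str] = {}

    def register(self, label: str) -> bool:
        key = skeleton(label)
        if key in self._by_skeleton:
            return False  # a confusable (or identical) label already exists
        self._by_skeleton[key] = label
        return True


kannada = "\u0C85\u0CB0\u0C97"  # Kannada A RA GA
telugu = "\u0C05\u0C30\u0C17"   # Telugu A RA GA

registry = Registry()
first = registry.register(kannada)   # accepted: nothing registered yet
second = registry.register(telugu)   # rejected: same skeleton as the first
```

Because both labels are keyed by the same skeleton, the order of registration
does not matter: whichever comes second is refused. Nothing here touches
rendering; it works purely on the encoded character stream.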
So rendering doesn't really matter here, but it would matter in
situations where glyph reordering occurs in Indic scripts (or elsewhere). For
example, if the glyph for Bengali E ে is confusable with something
else, say Bengali ২ (just imagine that that's confusable) then
BENGALI LETTER XXX + BENGALI VOWEL SIGN E
should be confusable with
BENGALI DIGIT TWO + BENGALI LETTER XXX
and a simple mapping of BENGALI VOWEL SIGN E to BENGALI DIGIT TWO (or
vice versa) would not be sufficient because the real confusion *in the
mind of the user* occurs *after rendering* and when the characters are
displayed on screen.
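
To illustrate the point about reordering, here is a minimal sketch that
compares strings in approximate visual order rather than logical order. It
makes several assumptions: BENGALI LETTER KA stands in for "LETTER XXX", the
confusability of VOWEL SIGN E and DIGIT TWO is the imaginary one from above,
only VOWEL SIGN E is treated as a pre-base sign, and the sign is moved before
a single preceding character (real shaping moves it before the whole
consonant cluster). The function names are invented for illustration.

```python
VOWEL_SIGN_E = "\u09C7"  # BENGALI VOWEL SIGN E (displayed left of its base)
DIGIT_TWO = "\u09E8"     # BENGALI DIGIT TWO
LETTER_KA = "\u0995"     # BENGALI LETTER KA, stand-in for "LETTER XXX"

# Simplification: only this one pre-base sign is handled.
PREBASE_SIGNS = {VOWEL_SIGN_E}

# Imaginary confusable pair, as in the text ("just imagine").
HYPOTHETICAL_CONFUSABLES = {VOWEL_SIGN_E: DIGIT_TWO}


def visual_order(text: str) -> list[str]:
    """Approximate display order: move a pre-base sign before its base."""
    out: list[str] = []
    for ch in text:
        if ch in PREBASE_SIGNS and out:
            out.insert(len(out) - 1, ch)  # place before the previous char
        else:
            out.append(ch)
    return out


def naive_skeleton(text: str) -> str:
    """Per-character mapping in logical (encoded) order."""
    return "".join(HYPOTHETICAL_CONFUSABLES.get(ch, ch) for ch in text)


def visual_skeleton(text: str) -> str:
    """Per-character mapping applied after reordering to visual order."""
    return "".join(HYPOTHETICAL_CONFUSABLES.get(ch, ch)
                   for ch in visual_order(text))


a = LETTER_KA + VOWEL_SIGN_E  # LETTER + VOWEL SIGN E
b = DIGIT_TWO + LETTER_KA     # DIGIT TWO + LETTER

naive_match = naive_skeleton(a) == naive_skeleton(b)     # mapping alone
visual_match = visual_skeleton(a) == visual_skeleton(b)  # reorder, then map
```

With the simple mapping the two strings still differ, because the mapped
digit ends up after the letter in one and before it in the other. Only after
reordering to visual order do they collide, which is where the user would
actually be confused.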
UTR36/39 don't seem to take exactly this into consideration.
http://unicode.org/reports/tr36/#TableCombiningMarkOrderSpoofing only
talks about when two vowel signs are applied to the same consonant --
not when one confusable is a reordrant vowel sign and another is a
non-reordering (usually consonant) character. Such cases are to be
found in other Indic scripts as well.
BTW I should mention that the example ARAGA is not really a word in
Kannada or Telugu AFAIK but just something I gave for its confusable
nature.
Shriramana.