Re: Telugu vs Kannada confusables

From: Christopher Fynn (chris.fynn@gmail.com)
Date: Sun Nov 28 2010 - 12:32:38 CST

    On 27/11/2010, Shriramana Sharma <samjnaa@gmail.com> wrote:
    > On Sat, Nov 27, 2010 at 5:29 PM, Christopher Fynn <chris.fynn@gmail.com>
    > wrote:
    >> I wonder, in a case like this, which of the two scripts takes precedence?
    >
    > Where's the question of precedence? As I understand it, confusable
    > mappings go from higher codepoint to lower codepoint, so it's just a
    > question of folding -- something like case folding.
    >
    > అరగ ಅರಗ అರగ ಅరಗ (you'll have to look at that via UniView to get the
    > difference) will all fold to అరగ (all Telugu) if I am not mistaken
    > (provided the appropriate Confusables.txt entries are present) and
    > given that mixed script domain names are (almost) prohibited, whoever
    > registers whichever first -- whether all Kannada or all Telugu -- will
    > get precedence, even though in internal processing the Kannada
    > codepoints will fold to Telugu.
    >
    > @ the techies here: I hope I got that right...
    >
    > Shriramana Sharma.
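
A rough sketch of the folding described above, in Python. It assumes, purely for illustration, that every Kannada code point folds to the Telugu code point 0x80 below it (the two blocks are laid out in parallel). The real confusables data lists individual mappings and need not cover every character, so the offset rule and the function name are hypothetical.

```python
# Illustrative only: this offset rule is an assumption, not real confusables.txt data.
KANNADA_START, KANNADA_END = 0x0C80, 0x0CFF
BLOCK_OFFSET = 0x80  # Kannada block starts 0x80 above the Telugu block (U+0C00)


def fold_kannada_to_telugu(text: str) -> str:
    """Map each Kannada code point to the parallel Telugu code point."""
    out = []
    for ch in text:
        cp = ord(ch)
        if KANNADA_START <= cp <= KANNADA_END:
            out.append(chr(cp - BLOCK_OFFSET))
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)


# The four variants from the quoted message:
# all Telugu, all Kannada, and the two mixed spellings.
variants = [
    "\u0C05\u0C30\u0C17",
    "\u0C85\u0CB0\u0C97",
    "\u0C05\u0CB0\u0C17",
    "\u0C85\u0C30\u0C97",
]
folded = {fold_kannada_to_telugu(v) for v in variants}
```

Under that assumption all four spellings collapse to the single all-Telugu string, which is the sense in which the fold resembles case folding.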

    Do the folded domain names get displayed?

    If so, doesn't this have the potential to break some rendering
    systems? OpenType rendering engines usually do not apply the glyph
    substitutions and positioning necessary to form the conjuncts in Indic
    scripts across script boundaries (anyway the glyphs for Kannada and
    Telugu are likely to be in separate fonts). If you have to display
    mixed Kannada and Telugu characters in a name the results might look
    pretty odd.
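
To make the script-boundary point concrete, here is a minimal Python sketch that splits a string into script runs, roughly the way a shaping engine itemizes text before shaping each run on its own. It classifies characters by block range only, which is an assumption for brevity. Real engines use the Unicode Script property and treat Common and Inherited characters specially. The function names are hypothetical.

```python
def script_of(ch: str) -> str:
    """Classify a character by block range (a simplification)."""
    cp = ord(ch)
    if 0x0C00 <= cp <= 0x0C7F:
        return "Telugu"
    if 0x0C80 <= cp <= 0x0CFF:
        return "Kannada"
    return "Other"


def script_runs(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters of the same script into runs."""
    runs: list[tuple[str, str]] = []
    for ch in text:
        script = script_of(ch)
        if runs and runs[-1][0] == script:
            # Same script as the previous character: extend the current run.
            runs[-1] = (script, runs[-1][1] + ch)
        else:
            # Script changed: start a new run.
            runs.append((script, ch))
    return runs


mixed = "\u0C05\u0CB0\u0C17"   # Telugu, Kannada, Telugu
pure = "\u0C05\u0C30\u0C17"    # all Telugu
mixed_runs = script_runs(mixed)
pure_runs = script_runs(pure)
```

The mixed spelling breaks into three one-character runs while the all-Telugu spelling stays a single run, so any shaping that depends on neighbouring characters in the same run would not apply across the mixed name.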

    - C


