RE: Unicodes For Devanagari: Magic The Gathering Card

From: Victor G Campbell
Date: Wed Nov 06 2002 - 11:24:00 EST

    Thank you for the input. I have altered the test page to include some of
    your suggestions regarding the Unicode glyphs. You may need to refresh the
    page to see them.

    My main goal is to reproduce the appearance of the card's text exactly. It's
    becoming apparent that this may not be possible. At this point, whether the
    publisher made the best character choices is moot. I just have to live with
    it, inconsistencies and all. Completing the rest of the card's text is going
    to be even more difficult. My only choice may be finding a font with all the
    characters (I can't find some of them in Sanskrit 98) and using a GIF on the


    Victor Campbell wrote:
    > I'm looking for help with converting the text of a Sanskrit
    > trading card to
    > Unicode. I am not connected with the publisher of the card, just a
    > programmer who helps support a site for collectors.
    > I have set up a test page for experimenting with the
    > Devanagari Unicodes at
    > this address:

    || 1. This is what I want: Fungal Shambler

    This is NOT what you want! :-)

    You say that you want a vocalic-L (U+090C, decimal 2316), but the glyph I
    see in the picture is a CONSONANT LA (U+0932, decimal 2354).
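
    One quick way to double-check which of the two letters a given code
    point is: look up its official character name. The sketch below assumes
    Python 3 and its standard unicodedata module; the helper name describe()
    is made up for this illustration.

```python
import unicodedata


def describe(code_point):
    """Return 'U+XXXX (decimal N): NAME' for a single code point."""
    ch = chr(code_point)
    return f"U+{code_point:04X} (decimal {code_point}): {unicodedata.name(ch)}"


# The letter that was requested versus the letter visible on the card.
vocalic_l = describe(0x090C)
consonant_la = describe(0x0932)

print(vocalic_l)     # U+090C (decimal 2316): DEVANAGARI LETTER VOCALIC L
print(consonant_la)  # U+0932 (decimal 2354): DEVANAGARI LETTER LA
```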

    The LA glyph in the Sanskrit 98 font looks different from the LA glyph in
    the Unicode charts: in your font, the right side of the letter is rounded,
    while on the charts it is a straight line. But it is nevertheless the same
    letter, coming in two slight typographic variants.

    Notice that vocalic L has a "tail" under the letter: *that* is an essential
    trait, and the characteristic that distinguishes the consonant from the

    || 2. This I get instead of what I want.
    || pha anusvara ga la virama sha ma virama ba virama [ZWJ] vocalic-l ra

    The sequence <..., ba, virama, ZWJ, vocalic-l, ...> is wrong. It should be
    <..., ba, virama, la, ...>:

            ..., U+092C, U+094D, U+0932, ...

    Or, in decimal HTML:

            ... &#2348;&#2381;&#2354; ...
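
    If you would rather generate the decimal references than type them by
    hand, something like the following should do. It is only a sketch,
    assuming Python 3; the function name to_decimal_ncr() and the constant
    names are invented for this example.

```python
import html

# Code points of the corrected cluster: BA, VIRAMA, LA.
BA, VIRAMA, LA = 0x092C, 0x094D, 0x0932


def to_decimal_ncr(code_points):
    """Turn a sequence of code points into decimal HTML character references."""
    return "".join(f"&#{cp};" for cp in code_points)


ncr = to_decimal_ncr([BA, VIRAMA, LA])
print(ncr)  # &#2348;&#2381;&#2354;

# Round trip: decoding the references gives back the same three characters.
decoded = html.unescape(ncr)
print([f"U+{ord(ch):04X}" for ch in decoded])  # ['U+092C', 'U+094D', 'U+0932']
```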

    Apart from this, the encoding is correct. (The ZWJ is not wrong, just useless.)

    || (The desired form of sha, and ba joined with the desired form of

    If you still see it incorrectly, it is because your font or operating system
    doesn't fully support Indic rendering.

    You can upgrade your PC to a different font or operating system but,
    unfortunately, there is nothing you can do to ensure that your users will do
    the same.


    BTW, a couple of side notes about the transliteration:

    - There is a special character to transliterate European "f": U+095E (dec.
    2398). It looks like PHA with a dot under it.

    - Using anusvara for the "n" in "fungal" and a MA for the "m" in "shambler"
    seems inconsistent. I'd either use anusvara for both or NGA (U+0919, decimal
    2329) for the first one and MA for the second one.

    - "Fungal shambler" is two words in English: why did you join them in
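
    On the first note: U+095E has a canonical decomposition into PHA plus
    NUKTA, which is why it looks like PHA with a dot under it. A small
    sketch, again assuming Python 3 and the standard unicodedata module,
    shows the relationship. Be aware that NFC normalization replaces the
    single character with the two-character sequence, so text that passes
    through a normalizer may come back as PHA + NUKTA.

```python
import unicodedata

FA = "\u095E"  # the character suggested for European "f"

name = unicodedata.name(FA)
decomposition = unicodedata.decomposition(FA)
nfc = unicodedata.normalize("NFC", FA)

print(ord(FA), name)   # 2398 DEVANAGARI LETTER FA
print(decomposition)   # 092B 093C  (PHA followed by NUKTA)

# U+095E is excluded from composition, so NFC yields the two-character form.
print([f"U+{ord(ch):04X}" for ch in nfc])  # ['U+092B', 'U+093C']
```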



    _ Marco
