Re: Transcoding Tamil in the presence of markup

From: John Delacour (JD@BD8.COM)
Date: Sun Dec 07 2003 - 13:46:00 EST

    At 2:43 pm +0100 7/12/03, Peter Jacobi wrote:

    >Then you consider
    > <span style='color:#00f'>&#x0BB2;</span>&#x0BCA;
    >to be valid input, which ideally should render as intended?

    I have uploaded a valid page to

    <http://bd8.com/temp/tamil_unicode_tscii.html>

    where you should see the lo properly displayed (in the second case).
    As to the TSCII stuff I have simply followed your encodings, which
    seem to give different glyphs, but maybe the first font in my list
    (MylaiTSC) is encoded differently -- so much for unregistered legacy
    encodings.

    In your TSCII version you write

    &#xa7;<span>&#xc4;</span>&#xa1;

    Is that not equivalent to the following Unicode sequence?

    &#xbc6;<span>&#xbb2;</span>&#xbbe;
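    A minimal sketch of how the two forms relate, assuming only the
    standard Unicode canonical decomposition of U+0BCA into U+0BC6 plus
    U+0BBE. The helper name visual_order is hypothetical, and it handles
    just a single consonant followed by one vowel sign, not conjuncts or
    longer runs. It says nothing about TSCII byte values beyond the three
    quoted above.

```python
import unicodedata

# Tamil vowel signs that are drawn to the left of their consonant.
LEFT_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}  # E, EE, AI


def visual_order(syllable: str) -> str:
    """Return a consonant + vowel-sign syllable in left-to-right glyph order.

    Hypothetical helper: two-part signs such as U+0BCA are first split by
    NFD, then a left-side sign is moved in front of the consonant.
    """
    decomposed = unicodedata.normalize("NFD", syllable)
    if len(decomposed) >= 2 and decomposed[1] in LEFT_SIGNS:
        return decomposed[1] + decomposed[0] + decomposed[2:]
    return decomposed


lo = "\u0bb2\u0bca"  # LA + VOWEL SIGN O, in Unicode logical order
print([f"U+{ord(c):04X}" for c in visual_order(lo)])
# ['U+0BC6', 'U+0BB2', 'U+0BBE']
```

    The result is the same three code points as the sequence above, in
    glyph order; in Unicode logical order the consonant U+0BB2 would come
    first.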

    >From a processing point of view, it is somewhat challenging, as you
    >may have to parse through lots of markup, until you know what to do
    >with the 0BB2.

    That seems fairly easy. I must be missing the point.
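    A rough sketch of the lookahead being discussed, assuming the input
    is an HTML fragment and using Python's standard html.parser module.
    The names TextCollector and text_without_markup are made up for
    illustration. The sketch only recovers the character stream so that
    U+0BB2 and the following U+0BCA can be seen side by side; a real
    transcoder would also have to remember where the tags were in order
    to write them back out.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect character data and ignore every tag (illustrative only)."""

    def __init__(self):
        # convert_charrefs=True turns &#x0BB2; and friends into characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_without_markup(fragment: str) -> str:
    """Return the text of an HTML fragment with all tags skipped."""
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


sample = "<span style='color:#00f'>&#x0BB2;</span>&#x0BCA;"
text = text_without_markup(sample)
print([f"U+{ord(c):04X}" for c in text])
# ['U+0BB2', 'U+0BCA']
```

    Once the tags are out of the way, the vowel sign sits directly after
    its consonant, whatever markup separated them in the source.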

    >As I've understood from other posts, the font support for
    >all this is theoretically available, but not often done in practice.

    For Windows browsers I find I have to specify a Unicode font (in this
    case Arial Unicode MS) in order for pages to display properly without
    the user fiddling with his browser preferences. As I said, I have
    WinNT 4.0, so maybe this has changed now. The Mac browsers (Safari,
    OmniWeb) require no font to be specified and will display the correct
    characters no matter what the user's defaults are. I have nothing to
    do with Mozilla.

    JD


