Re: Fun with proof by analogy, was Re: Mojibake on my Web pages

From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Sep 26 2003 - 05:21:59 EDT

    On 26/09/2003 00:14, Peter_Constable@sil.org wrote:

    >>>The last agent handling the document would be the mail carrier.
    >>>Does the mail carrier have the right to open the mailing and
    >>>replace your document with garbage?
    >>>
    >>No, however if I receive a letter in the post written in German I'm
    >>going to ask someone to translate it rather than try to cope with a
    >>language (cf. encoding) I don't understand.
    >>
    >
    >Unlike James's cup of wine, this really is a good analogy. Suppose the
    >document is stored on the server in ISO 8859-1 and the browser requesting
    >the page understands only EBCDIC. The server must convert it -- if it
    >doesn't, it will appear on the client as complete garbage. As Jon
    >mentioned, the server is the last one to touch it, and this illustrates
    >why it is appropriate for the server to touch it.
    >
    >Peter
    >

    Is server software actually obliged to perform such conversions on
    request? Surely, rather, browsers should be expected to support a
    certain minimum set of encodings, or else it should be left to the
    content provider and the reader and/or their software to agree on
    something acceptable. After all, if someone in China sends me snail
    mail, the mail carrier is not under any obligation to translate it for
    me. On the contrary, I would be offended if they tried, without my
    explicit permission, on the basis that the content of the letter is none
    of their business. I need to agree with the sender to send it in English
    rather than Chinese, or else get it translated myself.
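
    For concreteness, here is a rough sketch in Python of what
    "conversion on request" might look like on the server side. It is
    only an illustration under stated assumptions: the function names
    pick_charset and serve and the list of supported charsets are
    hypothetical, the q-values that HTTP allows in Accept-Charset are
    ignored, and cp037 is simply Python's name for one EBCDIC code page.

```python
def pick_charset(accept_charset, supported):
    """Return the first charset in the header that the server supports.

    Hypothetical helper: ignores q-values and the "*" wildcard.
    """
    for item in accept_charset.split(","):
        name = item.split(";")[0].strip().lower()
        if name in supported:
            return name
    return None


def serve(stored_bytes, stored_charset, accept_charset):
    """Transcode the stored page if the client asked for another charset."""
    supported = ["iso-8859-1", "utf-8", "cp037"]  # assumed server capabilities
    target = pick_charset(accept_charset, supported)
    if target is None or target == stored_charset:
        # Nothing usable was requested: send the page exactly as stored.
        return stored_charset, stored_bytes
    text = stored_bytes.decode(stored_charset)
    return target, text.encode(target)


stored = "Gr\u00fc\u00dfe".encode("iso-8859-1")
charset, body = serve(stored, "iso-8859-1", "cp037")
untouched_charset, untouched_body = serve(stored, "iso-8859-1", "x-unknown")
```

    Note that when the request names nothing the server knows, the
    sketch falls back to sending the stored bytes untouched, which is
    the behaviour I am arguing should be acceptable.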

    In any case, I would assume that in practice any browser can at
    least understand ASCII, and if presented with a page in UTF-8 will at
    worst display U+0020 to U+007E correctly and the rest as some kind of
    mojibake. And if it can't understand any other script in UTF-8, chances
    are it can't understand whatever other encoding it is presented with
    either, so there is little gained by converting the page to some
    specific code page.
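
    The ASCII point can be checked with a small Python sketch. The
    sample string and the two "clients" are assumptions made up for the
    illustration: one decodes every page as ISO 8859-1, the other knows
    only ASCII and substitutes a replacement character for anything
    else. The helper ascii_part is likewise hypothetical.

```python
# Sample text mixing ASCII with Latin and Chinese characters, sent as UTF-8.
text = "Caf\u00e9 na\u00efve \u4e2d\u6587"
utf8_bytes = text.encode("utf-8")

# Hypothetical client that assumes ISO 8859-1 for every page.
seen_by_latin1_client = utf8_bytes.decode("latin-1")

# Hypothetical ASCII-only client that replaces bytes it cannot decode.
seen_by_ascii_client = utf8_bytes.decode("ascii", errors="replace")


def ascii_part(s):
    """Keep only the printable ASCII characters, U+0020 to U+007E."""
    return "".join(ch for ch in s if 0x20 <= ord(ch) <= 0x7E)


print(repr(seen_by_latin1_client))
print(repr(seen_by_ascii_client))
print(ascii_part(seen_by_latin1_client) == ascii_part(text))
```

    Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so the
    ASCII characters come through intact in both cases and only the
    non-ASCII characters turn into mojibake or replacement marks.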

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

