RE: TR35

From: Peter Constable (petercon@microsoft.com)
Date: Fri May 14 2004 - 15:22:24 CDT

    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    > On Behalf Of Antoine Leca

    > I wrote about an electronic document, sorry, file, I might receive
    > containing an order form, and you said documents did not encompass
    > order forms, as I read it.

    An order form is not a case we can evaluate without actually analyzing
    in more detail exactly how information is being exchanged, whether
    public protocols are in use, and how the processes on each end are to
    work. It is simply inadequate analysis of usage scenarios to say "an
    order form contains formatted dates / numbers / currency that need to be
    interpreted, therefore this document has a locale". For instance, if the
    order information is exchanged using some XML schema involving, say

    <item id="79234CRX">
            <name>Buckwheat flour (bulk)</name>
            <amt>123,456</amt>
    </item>

    there's a very good chance that the order application was designed so
    that the number inside the <amt> element was in a locale-independent
    representation. In that case, there is no reason whatsoever to say
    anything more about this record than that English is used. (Actually, it
    would be most appropriate to simply say that the name element is in
    English: <name xml:lang="en">.) But if the <amt> record is *not* in a
    neutral representation, then there are several other questions that need
    to be considered regarding how the string was generated, and how the
    receiver knows what was assumed by the authoring process.
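
    To make the ambiguity concrete, here is a minimal sketch in Python (not
    part of any real order system). The CONVENTIONS table and the
    parse_amount helper are hypothetical and hard-coded for illustration; a
    real application would take its separators from locale data. It shows
    that the same <amt> string yields two different quantities depending on
    what the authoring process assumed, and that a strict neutral parse
    rejects it outright.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical separator conventions, hard-coded for illustration only.
CONVENTIONS = {
    "en-US": {"group": ",", "decimal": "."},
    "de-DE": {"group": ".", "decimal": ","},
}


def parse_amount(text, convention=None):
    """Parse a numeric string under an assumed convention.

    With convention=None the string is handed to Decimal unchanged, so
    only forms Decimal itself accepts (such as "123456" or "123.456")
    succeed; "123,456" raises decimal.InvalidOperation.
    """
    if convention is None:
        return Decimal(text)
    seps = CONVENTIONS[convention]
    # Drop grouping separators first, then map the decimal mark to ".".
    normalized = text.replace(seps["group"], "").replace(seps["decimal"], ".")
    return Decimal(normalized)


# The record from the message, with the language tag on <name> only.
RECORD = """<item id="79234CRX">
    <name xml:lang="en">Buckwheat flour (bulk)</name>
    <amt>123,456</amt>
</item>"""

item = ET.fromstring(RECORD)
raw = item.findtext("amt")

as_en = parse_amount(raw, "en-US")  # grouping comma: 123456
as_de = parse_amount(raw, "de-DE")  # decimal comma: 123.456
```

    Note that the xml:lang="en" on <name> plays no part in either parse: the
    language tag describes the text of the name, while the reading of <amt>
    depends entirely on a convention that the record itself does not state.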

    The point is, we need to do analysis at that kind of level, not in
    sweeping terms like "order forms are documents that require locales".

    > And these files do
    > include or refer locale ids and language ids, sometimes named one for
    > the other BTW.

    Just because someone called the two the same doesn't mean that the
    notions are not distinct, and that it wouldn't be helpful for us to
    understand that distinction.

    > And what you see as "internal to
    > your process" is, to me, actually usable, external data.

    If you consider it external, then it is because you expect others to use
    what you put there, or you are using what others put there -- and so it
    is indeed external.

    > See my example,
    > imagining it is a text processing file: deeply inside, I have found
    > the locale id of the sender. Which was a hint, not the real data I
    > would have liked.

    If the document includes an ID that indicates the locale mode that was
    set in the author's software when the author created that file, and you
    wish to use that as a hint to set a processing mode on your end, I have
    no problem with that; I have never said anything against that.

    > To be able to have my job done, I sometimes (often, in fact) have to
    > use different software... Now, one can
    > just deface me saying that I am not supposed to look at that, that the
    > users should restrict themselves to the next release of XML. This is
    > equivalent to saying, users are not invited to the discussions about
    > the tools they will use...

    I have no qualms with what you may need to do now to get your job done.
    When all we have is a hammer, everything starts to look like a nail, and
    we need to wring as much benefit within that constraint as we can. All
    I'm saying is that we should not be content to stay there. I have no intent
    of telling anyone they cannot do what they are doing. Rather, I'm saying
    that the conceptual model we have inherited from the past is inadequate,
    and that we need to adopt a more carefully-conceived model around which
    to design i18n platforms for the future. And it starts by understanding
    that while they may be related, "locale" and "language" are conceptually
    two different things. As for participating in the discussion, I am not
    trying to keep anyone out.

    > a very common behaviour of the computer people here in Europe, and a
    > behaviour I am very angry about (hence the sarcasm, for which I
    > would apologize).

    I was not aware of that background. Apology most kindly accepted.

    Peter
     
    Peter Constable
    Globalization Infrastructure and Font Technologies
    Microsoft Windows Division


