Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

Date: Wed Jun 02 2010 - 06:27:37 CDT

    From: William_J_G Overington
    > On Tuesday 1 June 2010, John H. Jenkins wrote:
    > > First of all, as Michael says, this
    > > isn't character encoding.
    > Well, it is a collection of portable interpretable object code items encoded
    > within a character encoding as if the items were characters.

    There is a monumental gap between "items encoded ... as if [they] were
    characters" and actual characters. This gap is the gap between Unicode and not

    > > You're not interchanging plain text.
    > True, but the items are interchanged as if they were plain text items within
    > the structure of the way that plain text is interchanged.

    Lots of things are interchanged. Machine code is interchanged, Scalable Vector
    Graphics are interchanged, executables are interchanged. None of these are
    plain text. They should not be interpreted as plain text. They should not be
    displayed as plain text, except for providing a way for those who understand
    the "text" as merely a representation of bytes of data that have a non-plain
    text meaning, so they can check the data. Object code is not Unicode, it is
    something else.

    > > This is essentially machine language
    > > you're writing here, and there are entirely different venues
    > > for developing this kind of thing.
    > Well, it is an object code for a virtual machine rather than a machine code
    > for a virtual machine as external name links can be included. Also, it has
    > high level language style constructs of while loops and repeat loops rather
    > than the jump to an address instructions of a typical machine code. Also, it
    > is relocatable in relation to the underlying memory structure of the host
    > computer: some machine codes can be relocatable as well, so I am not claiming
    > relocatablity as a distinguishing feature from machine code, I am just
    > mentioning the relocatability feature of the portable interpretable object code.

    There is no difference between a virtual machine code and a physical machine
    code to a CHARACTER encoding standard. The fact that it has a high level
    language style means nothing, absolutely nothing. C code is C code, whether it
    is encoded as ASCII, Unicode, ISCII, Big 5, Shift-JIS, or anything else. The
    details of object code are immaterial to it being fundamentally a form of
    machine language, not a character.
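
    To illustrate the point that source text stays the same text under any
    character encoding, here is a minimal Python sketch. The sample string,
    the codec list, and the helper names are assumptions chosen for
    illustration; ISCII is left out because Python ships no codec for it.

```python
# Illustrative sketch: SOURCE, CODECS and the helper names are hypothetical.
SOURCE = "int main(void) { return 0; }"
CODECS = ["ascii", "utf-8", "utf-16", "shift_jis", "big5"]


def round_trip(text, codec):
    """Encode text to bytes with the given codec, then decode it back."""
    return text.encode(codec).decode(codec)


def byte_forms(text, codecs):
    """Map each codec name to the byte sequence it produces for text."""
    return {codec: text.encode(codec) for codec in codecs}


forms = byte_forms(SOURCE, CODECS)
for codec, raw in forms.items():
    # The byte sequences may differ, but the decoded text should not.
    same = raw.decode(codec) == SOURCE
    print(f"{codec:10} {len(raw):3} bytes  same text: {same}")
```

    The byte sequences can differ from one encoding to the next while the
    decoded characters are identical, which is all a character encoding
    standard concerns itself with.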

    > > Secondly, I have virtually no idea what problem this is
    > > attempting to solve unless it's attempting to embed a text
    > > rendering engine within plain text. If so, it's both
    > > entirely superfluous (there are already projects to provide
    > > for cross-platform support for text rendering) and woefully
    > > inadequate and underspecified. Even if this were
    > > sufficient to be able to draw a currently unencoded script,
    > > the fact of the matter is that it doesn't allow for doing
    > > anything with the script other than drawing.
    > > (Spell-checking? Sorting? Text-to-speech?)
    > The portable interpretable object code is intended to be a system to use to
    > program software packages to solve problems of software globalization,
    > particularly in relation to systems that use software to process text.
    > > Unicode and ISO/IEC 10646 are attempts to solve a basic,
    > > simply-described problem: provide for a standardized
    > > computer representation of plain text written using existing
    > > writing systems.
    > Well, that might well be the case historically,

    It is the case now.

    > yet then the emoji were invented and they were encoded.

    Every writing system was invented.

    > The emoji existed at the time that they
    > were encoded, yet they did not exist at the time that the standards were
    > started.

    Immaterial. The question is whether they ARE plain text that is used as Plain

    > So, if the idea of the portable interpretable object code
    > gathers support, then maybe the defined scope of the standards will
    > become extended.

    No. Unicode encodes plain text. Period. Emoji are no different. They are
    exchanged as plain text, and act as plain text. They were not encoded before
    they were exchanged as plain text, they were only encoded ONCE they were used
    as plain text. The key word here, and everywhere else, is "Plain text". If
    it's not Plain text, it is not, has never been, and never will be germane.

    > > That's it. Any attempt to use
    > > the two to do something different is not going to fly.
    > Well, I appreciate that the use of the phrase "not going to fly" is a
    > metaphor and I could use a creative writing metaphor of it soaring on
    > thermals above olive groves, yet to what exactly are you using the
    > metaphor "not going to fly" to refer please?

    You know perfectly well what it means, seeing as you speak native, colloquial
    English. You may think it's cute, but the people who have responded to you are
    serious people who have dedicated their lives to addressing the real issues of
    globalization, and it is both disrespectful and counterproductive to make
    comments like this.

    > I know of no reason to think that a person "skilled in the art" would be unable
    > to write an iPad app to receive a program written in the portable interpretable
    > object code arriving within a Unicode text message and then for the program to
    > run in a virtual machine within the app, displaying a graphical result on the
    > screen of the iPad. Could such an app be written based on the information in the
    > paper_draft_005.pdf document?

    Which is the single most important piece of evidence that this is NOT plain
    text, and is completely, totally, and undeniably not Unicode.

    > The Unicode Technical Committee considers proposals. If a proposal for encoding a
    > portable interpretable object code becomes placed before them, then the Unicode
    > Technical Committee will be able to assess the proposal in accordance with their
    > rules as those rules stand at the time.

    The people who have responded to you have immeasurable experience with the
    Unicode Technical Committee. They are telling you EXACTLY what will be said in
    a UTC meeting. There is no subterfuge. They aren't hazing the newbie. This is
    the knowledge of experts who know the Standard inside and out, because they
    helped write it. Many of the people on this list actually ARE members of the
    UTC, and if they disagreed in the slightest, they would instantly chime in. No
    one is coming to your side and saying "there is a small chance that this could

    > > Creating new writing systems, directly embedding language,
    > > directly embedding mathematics or machine language--all of
    > > these are entirely outside of Unicode's purview and WG2's
    > > remit. They simply will not be adopted.
    > Well, the emoji is a new writing system and that is being encoded. The encoding of
    > the emoji has made me realize that the encoding of the portable interpretable
    > object code is not an impossibility.

    Emoji were not created, then suggested to Unicode so they'd be interchanged:
    they were created and promulgated, as plain text - not as if they were plain
    text, but actually as plain text - and then encoded in Unicode. If you are
    pinning your hopes on Emoji as a precedent, you are sadly unaware of even the
    most basic tenets of the Standard.

    > > Your enthusiasm may be commendable, but you're spending
    > > your energy developing something which is not appropriate
    > > for inclusion within Unicode.
    > Thank you for your first remark, yet whether the portable interpretable object
    > code is or is not appropriate for inclusion within Unicode is a matter that is
    > not decided at this time.

    Actually, it is. Whether you wish to accept it is the only undecided matter.

    > There was a time when emoticons were not regarded as appropriate for inclusion in
    > Unicode, yet they are now being encoded. That is an important precedent that what
    > is appropriate depends upon the circumstances at the time, not on what was the
    > policy previously.

    Emoticons (as emoji) are exchanged as plain text. The only consideration that
    changed was whether they should be considered as markup or not. Eventually, it
    became clear that they no longer classify as markup, but as plain text.
    This was not a change in policy, it was a development in evidence.

    > Plane 12 is empty at present and I am unaware of any other plans for its use. Rather
    > than a phrase such as "not appropriate" being used I feel that the approach could be
    > that there is plane 12, someone is suggesting using it for a futuristic idea, so let
    > us have a look at the idea, let us study the idea and try to improve it so as to get
    > the best possible result and then, as long as it is possible to demonstrate that
    > implementing the idea will do no harm, let us implement it.

    Planes 15 and 16 are Private Use planes. Nobody cares what you do there.
    That's what they're for. The only thing that Unicode has to say about them is
    that they are for Private Use, and public use by private agreement. Just
    because there are no CURRENT plans for a plane does not mean it is open for
    you to do whatever the heck you want, just as you shouldn't use the empty
    spots in the Greek block. Reserved means "don't do anything here". If you want
    to fulfill your craziest desires, like writing in Klingon pIqaD, use the
    Private Use areas.
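
    The distinction between Private Use and merely unassigned code points can
    be checked programmatically. The following is a rough Python sketch, not
    an official tool: the function names and sample code points are chosen
    for illustration, and the results depend on the version of the Unicode
    Character Database bundled with the Python interpreter in use.

```python
import unicodedata


def plane(cp):
    """Return the plane number (0 to 16) of a code point."""
    return cp >> 16


def status(cp):
    """Classify a code point by its General_Category (illustrative helper)."""
    category = unicodedata.category(chr(cp))
    if category == "Co":
        return "private use"
    if category == "Cn":
        # Cn covers reserved (unassigned) code points and noncharacters.
        return "unassigned or noncharacter"
    return "assigned"


# Sample code points, chosen as assumptions for illustration.
SAMPLES = [
    0x03B1,    # GREEK SMALL LETTER ALPHA
    0x0378,    # a gap inside the Greek and Coptic block
    0xE000,    # start of the BMP Private Use Area
    0xC0000,   # first code point of plane 12
    0xF0000,   # first code point of plane 15
    0x100000,  # first code point of plane 16
]

for cp in SAMPLES:
    print(f"U+{cp:04X}  plane {plane(cp):2}  {status(cp)}")
```

    Code points reported as private use are the ones available for private
    agreements; the ones reported as unassigned are reserved for future
    standardization.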

    > William Overington
    > 2 June 2010

    If you wish to discuss this further, please do so by private email, not on the

    Van Anderson