Re: Revised N2586R

From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Jun 26 2003 - 16:23:34 EDT


    Doug, Peter, and Michael already provided good responses to
    this suggestion by William O, but here is a little further
    clarification.

    > Well, certainly authority would be needed, yet I am suggesting that where a
    > few characters added into an established block are accepted, which is what
    > is claimed for these characters, there should be a faster route than having
    > to wait for bulk release in Unicode 4.1. If these characters have been
    > accepted, why not formally warrant their use now by having Unicode 4.001
    > and then having Unicode 4.002 when a few more are accepted?

    Approvals aren't *finished* until both the UTC and ISO JTC1/SC2/WG2 have
    completed their work. The JTC1 balloting and approval process is
    a lengthy and deliberate one, and there are many precedents where a
    proposed character, perhaps one already approved by the UTC, has
    been moved in a subsequent ballot in response to a national
    body comment. Only when both committees have completed all approvals
    and have verified that they are finally in sync with each other do they
    proceed with formal publication of the *standardized* encodings for the
    new characters.

    The reasons the UTC "approves" characters and posts them in the
    Pipeline page at www.unicode.org in advance of the actual final
    standardization are:

      A. To avoid the chicken and the egg problem for the two
         committees. Someone has to go first on an approval, since
         the committees do not meet jointly. Sometimes the UTC
         goes first, and sometimes WG2 goes first.
         
      B. To give notice to people regarding what is in process and
         what stage of approval it is at. This helps in precluding
         duplicate submissions and also helps in assigning code points
         for new characters when we are dealing with large numbers
         of new submissions.

    > These minor
    > additions to the Standard could be produced as characters are accepted and
    > publicised in the Unicode Consortium's webspace.

    The UTC can and does give notification regarding what characters have
    reached "approved" status. The Pipeline page at www.unicode.org is,
    for example, about to be updated with the 215 new character approvals
    from the recent UTC meeting.

    > If the characters have not
    > been accepted then they cannot be considered ready to be used, yet if they
    > have been accepted, what is the problem in releasing them so that people who
    > want to get on with using them can do so?

    See above. Standardization bodies must move deliberately and
    carefully, since if they publish mistakes, everybody is saddled
    with them essentially forever. When large numbers of additional
    characters are being encoded, the UTC knows from plenty of
    experience how much shuffling around can occur while
    balloting is still under way. It would therefore be irresponsible
    to publish small revisions and encourage people to start using
    characters that we know have not yet completed all steps of
    the standardization process.

    > Why is it that it is regarded by the Unicode Consortium
    > as reasonable that it takes years to get a character through the committees
    > and into use?

    Because with the experience of four major revisions of the Unicode
    Standard (and numerous minor revisions) and the experience of
    three major revisions of ISO/IEC 10646 (and numerous individual
    amendments) under our belt, we know that is how long it takes in
    actual practice.

    > The idea of having to use the
    > Private Use Area for a period after the characters have been accepted is
    > just a nonsense.

    Please take a look at:

    http://www.unicode.org/alloc/Caution.html

    which has long been posted to help explain why character approval
    is not just an instantaneous process.

    The further along a particular character happens to be in
    the ISO JTC1 approval process, the less likely it is to
    move before the standard is actually published.
    Implementers can, of course, choose whatever level of risk
    they can handle when doing early implementation of provisionally
    approved characters which have not yet been formally published in
    the standards. But if they guess wrong and implement a
    character (in a font or in anything else) that is moved at
    some point in the balloting, then that was just the risk they
    took, and they can't expect to come back to the committees
    bearing complaints and grievances about it.

    If you, for example, want to put U+267F HANDICAPPED SIGN
    in a font now, nobody will stop you, but bear in mind that
    this character is only at Stage 1 of the ISO process -- it
    has not yet been considered or even provisionally approved
    by WG2. Not only is the name likely to change (based on
    all the issues already discussed), but it is conceivable
    that WG2 could decide to approve it at some other code position
    instead. It is even conceivable that WG2 could *refuse* to
    encode the character. There have been precedents where a
    UTC-approved character met opposition in WG2, and the UTC
    later decided to rescind its approval in favor of maintaining
    synchronization of the standards when published.
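
    For implementers weighing that risk, one rough sanity check is to ask
    whether a code point is assigned in the published character data that a
    given platform actually ships. The following is only a minimal sketch,
    assuming Python and its standard unicodedata module; the helper name
    status() is hypothetical, and the answer depends entirely on which
    Unicode version the local Python build bundles, not on the Pipeline page.

```python
import unicodedata


def status(cp: int) -> str:
    """Classify a code point using the Unicode data bundled with this Python.

    Hypothetical helper for illustration only. It reports what the local
    character database says and knows nothing about pending approvals.
    """
    category = unicodedata.category(chr(cp))
    if category == "Co":
        return "private-use"  # Private Use Area code point
    if category == "Cn":
        return "unassigned"   # not assigned (or a noncharacter) in this data
    return "assigned"         # any other general category


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for cp in (0x0041, 0xE000, 0x267F):
        # Fall back to a placeholder for code points without a name.
        name = unicodedata.name(chr(cp), "<no name>")
        print(f"U+{cp:04X} {status(cp)} {name}")
```

    A code point reported as unassigned here is one whose position and name
    are not yet fixed in the data this runtime ships, so anything built on it
    carries the risk described above.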

    --Ken



