Re: Defined Private Use was: SSP default ignorable characters

From: Mark Davis (mark.davis@jtcsv.com)
Date: Thu Apr 29 2004 - 10:35:08 EDT

    I will repeat a few observations from an email of a month ago on the issue of
    PUA properties (or adding more PUA characters).

    I will also repeat, for the nth time, that discussion on this list has
    absolutely zero effect unless there is a concrete, well-thought-out, written
    proposal that is submitted to the UTC via the normal document submission
    process. If people are interested in seeing some action taken, instead of just
    blathering on and on, they will get together and come up with such a proposal.

    Mark

    ----- Original Message -----
    From: "Mark Davis" <mark.davis@jtcsv.com>
    Cc: <unicode@unicode.org>
    Sent: Wed, 2004 Mar 31 15:27
    Subject: Re: What is the principle?

    > While I disagree with most of what you've said on this list, it is not an
    > unreasonable proposal to change the default properties for some ranges of the
    > private use blocks. I don't think that this would, in practice, really disturb
    > any applications, because of #1 below.
    >
    > I have, however, a few observations.
    >
    > 1. PUA properties, as is clear from Ken's excellent descriptions, are simply
    > defaults. With the exception of normalization, no Unicode implementation is
    > required to observe them. So even if this change is made, any conformant
    > implementation is free to simply ignore it and just assign its own properties.
    > This would not be a magic wand.
    >
    > 2. Unicode properties are not sufficient for rendering. With technologies such
    > as Apple's, all of the other work can be done in a font. With OpenType, most
    > but not all can -- in particular, reordering has to be done by the
    > application/OS. So complex scripts that require reordering still would not be
    > interchangeable without private agreement.
    >
    > 3. Even excluding the normalization properties and other obviously inapplicable
    > properties (such as name or age), there are some 50-odd possible character
    > properties, many of them with multiple possible values: see
    >
    > http://www.unicode.org/Public/UNIDATA/PropertyAliases.txt
    > http://www.unicode.org/Public/UNIDATA/UCD.html#Properties
    > http://www.unicode.org/Public/UNIDATA/PropertyValueAliases.txt
    >
    > A concrete proposal would have to specify exactly which properties were
    > relevant, and what the values are for the proposed ranges. (Clearly an even
    > partition according to all the possible combinations would be completely
    > impractical.) If the goal is rendering, this means looking at the possible
    > combinations of properties that are relevant for rendering and proposing a
    > division that makes sense.
    >
    > Mark
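
    Point 1 above can be illustrated with a minimal Python sketch. It assumes a
    hypothetical private agreement, PRIVATE_OVERRIDES, that assigns its own
    general category and bidi class to a couple of PUA code points and falls
    back to the defaults reported by Python's standard unicodedata module for
    everything else. The table and the function names are invented for
    illustration; they are not part of Unicode or of any library.

```python
import unicodedata

# Hypothetical private agreement: code point -> (general category, bidi class).
# These values are illustrative assumptions, not anything defined by Unicode.
PRIVATE_OVERRIDES = {
    0xE000: ("Lo", "R"),    # this agreement treats U+E000 as a right-to-left letter
    0xE001: ("Mn", "NSM"),  # this agreement treats U+E001 as a nonspacing mark
}


def general_category(ch):
    """Return the privately agreed category if there is one, else the default."""
    override = PRIVATE_OVERRIDES.get(ord(ch))
    return override[0] if override else unicodedata.category(ch)


def bidi_class(ch):
    """Return the privately agreed bidi class if there is one, else the default."""
    override = PRIVATE_OVERRIDES.get(ord(ch))
    return override[1] if override else unicodedata.bidirectional(ch)
```

    A PUA code point that is not in the table, such as U+E002, still reports
    the default general category Co (private use). A receiver that does not
    share the table sees only those defaults, which is why changing the
    defaults "would not be a magic wand".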

    ...

    Re: exchange of PUA property data:

    > ANY dynamic reassignment of properties requires a major overhaul. There have
    > been proposals over the years for exchange of PU property data. All of them
    > have died, and I never expect to see any succeed.
    >
    > The reason is that most implementations just get properties with static calls,
    > e.g. isLetter(x). To change it to be dynamic, all of these calls in all
    > programs would have to be changed to reference a dynamic collection of
    > properties. In a single-threaded world, this wouldn't be too bad. But ours is
    > a multi-threaded world, and there it is nasty; and horrible if the same
    > document is expected to contain different sets of PU properties. There are
    > also performance implications, since properties are used so heavily in
    > processing.
    >
    > These are not whims of software vendors; they would be very expensive
    > retrofits for essentially no benefit.
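
    The contrast between a static property call and a dynamic one can be
    sketched as follows, in Python for brevity. is_letter_static stands in for
    a call like isLetter(x); PropertyContext and its methods are hypothetical
    names invented for this illustration, not an existing API. The sketch only
    shows that in the dynamic version every caller has to obtain and pass along
    the right context, and that a shared, mutable context needs locking once
    several threads can touch it.

```python
import threading
import unicodedata


def is_letter_static(ch):
    """Static lookup: the answer depends only on the character itself."""
    return unicodedata.category(ch).startswith("L")


class PropertyContext:
    """Hypothetical per-document set of private-use property assignments."""

    def __init__(self):
        self._lock = threading.Lock()
        self._categories = {}  # code point -> general category string

    def assign(self, code_point, category):
        # Assignments may arrive while other threads are reading.
        with self._lock:
            self._categories[code_point] = category

    def is_letter(self, ch):
        # Every lookup now pays for a lock and a table probe first.
        with self._lock:
            category = self._categories.get(ord(ch))
        if category is None:
            category = unicodedata.category(ch)
        return category.startswith("L")


# Two documents with conflicting agreements about the same code point.
doc_a = PropertyContext()
doc_a.assign(0xE000, "Lo")  # a letter in document A
doc_b = PropertyContext()
doc_b.assign(0xE000, "Nd")  # a digit in document B
```

    Here doc_a.is_letter("\ue000") and doc_b.is_letter("\ue000") disagree,
    while is_letter_static("\ue000") knows nothing of either agreement. Every
    existing call site would have to be rewritten to pick the right context.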

    Re: which properties are needed for rendering (speaker said bidi and default
    ignorable)

    > Those alone won't work. If you want stuff to render right, then you have to
    > include *any* property that systems may use to affect display. You do want
    > these characters to linebreak correctly, eh? That's why I said that a complete
    > proposal would have to spell out all the properties that would be considered,
    > and give reasons for the inclusions/exclusions.

    Re: adding more PUA characters with different properties:

    > There is no way I would advocate adding even more PU characters; the number
    > we have is wasteful as it is. (In hindsight, we shouldn't have gone beyond
    > U+FFFFF in any event.)


