Re: Why is the Hyphen property stabilized?

From: Kenneth Whistler
Date: Wed Oct 14 2009 - 14:37:30 CDT

    Karl Williamson asked:

    > I'm researching why this property got stabilized, and I can't find
    > anything on it explicitly;

    The short answer is that the UTC decided not to maintain
    it further, after Unicode 3.2.

    The long answer is that during 2002 and 2003 there were a whole
    series of complex issues debated by the UTC regarding hyphens:
    what properties to use for the soft hyphen (U+00AD), what to do for
    the Mongolian Todo "soft" hyphen (U+1806), issues with the
    Armenian hyphen (U+058A), an extended debate about just what
    "hyphens" really were, and their relationship to the ongoing
    development of UAX #14, and so on. Look at all the new text on
    hyphens and soft hyphen in particular that was added to Version
    4.0.0 of UAX #14.

    Because UAX #14 has elaborate discussion of hyphens and hyphenation,
    and the Line_Break property has its own values related to
    hyphenation (lb=BA versus lb=HY), unconnected to the older
    property defined in PropList.txt, the UTC decided it just didn't
    make any sense to keep trying to determine values for the
    unused Hyphen property for new characters added to the standard.

    The straw that broke this particular camel's back, in addition
    to the year-long discussion of properties for U+00AD SOFT HYPHEN,
    was the ongoing discussion about the encoding of what eventually
    was approved (for Unicode 4.1) as U+2E17 DOUBLE OBLIQUE HYPHEN.
    That was the first addition of a character that would have been
    given the Hyphen=True property value, but which wasn't, because
    the UTC had stabilized the property as of its Unicode 3.2 values.

    > and what alternatives one is supposed to use.

    UAX #14 and the Line_Break property, if you want to do anything
    useful with characters behaving as hyphens.
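
    As a rough illustration of consulting Line_Break rather than the
    Hyphen property, here is a minimal Python sketch. It assumes the
    usual UCD "code point(s) ; value # comment" layout of LineBreak.txt.
    The three sample lines and the helper names parse_ucd and line_break
    are illustrative assumptions, not part of any library; check the
    values against the LineBreak.txt of the Unicode version you use.

```python
# Illustrative excerpt in the LineBreak.txt layout (assumed values; verify
# against the real file for your Unicode version). Not the full data file.
SAMPLE = """\
002D          ; HY # Pd       HYPHEN-MINUS
00AD          ; BA # Cf       SOFT HYPHEN
2011          ; GL # Pd       NON-BREAKING HYPHEN
"""


def parse_ucd(text):
    """Map each code point to its property value, expanding XXXX..YYYY ranges."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment, then skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        cps, value = (field.strip() for field in line.split(";")[:2])
        first, _, last = cps.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            table[cp] = value
    return table


def line_break(ch, table, default="XX"):
    """Return the Line_Break value for ch; "XX" here only means 'not in table'."""
    return table.get(ord(ch), default)


LB = parse_ucd(SAMPLE)
```

    With the excerpt above, line_break("-", LB) gives "HY" while
    line_break("\u00ad", LB) gives "BA", which is the kind of distinction
    the Hyphen property never expressed.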


    > Language in the uax44 predecessor documents for a while said that
    > these properties have not been found useful in practice. But that
    > language was eventually deleted.
