RE: Euro Sign in 8859-15 (was: Re: Indian Rupee Sign to be chosen today)

From: Erkki I Kolehmainen (eik@iki.fi)
Date: Fri Jun 25 2010 - 16:25:13 CDT


    Ken and others,

    At the time I was the European project team leader for the standardization
    of the euro, and as such I pushed strongly for adding the euro sign to
    Latin-1. That could not be done without adding a new part, which then had
    to be done for the sake of visibility. I fully agree with Ken (as he quite
    well knows, I trust) that for quite a while no new character encoding
    standardization should have been done on anything but 10646/Unicode. As it
    is, the use of any of the 8859 parts can no longer really be justified for
    any purpose, and with 10646/Unicode the euro sign works extremely reliably.

    Sincerely, Erkki

    -----Original Message-----
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
    On Behalf Of Kenneth Whistler
    Sent: 25 June 2010 22:48
    To: unicode@unicode.org
    Cc: kenw@sybase.com
    Subject: Euro Sign in 8859-15 (was: Re: Indian Rupee Sign to be chosen today)

    > On Fri, 25 Jun 2010, I wrote
    >
    > > Even in the year 2010, the euro sign (¤) doesn't work reliably.
    >
    > in both the Unicode list and in the newsgroup de.test.
    >
    > unicode.org shows a euro sign:
    > http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0372.html
    >
    > groups.google.com shows a currency sign:
    > http://groups.google.co.uk/group/de.test/msg/e027e91e7ef17f62

    And as the snark seems to be spreading about this, let's step
    into the Wayback Machine for a moment...

    When 8859-15 was originally proposed in 1997 (see SC2/WG3 N388R, for
    those of you with deep document archives), primarily to add the euro
    sign to an 8-bit character set (but also to "fix" 8859-1 for
    French and Finnish), the U.S. NB voted against the subdivision
    of work, claiming in the strongest of terms that the proposal
    was inherently flawed and simply would not work to solve the
    problem(s) it was aimed at.

    I'll quote at length from the U.S. NB comments in SC2 N2994,
    dated 1997-11-21, "Summary of Voting on SC 2 N 2910, Proposal for
    Project Subdivision of project JTC 1.02.20: a new part of ISO/IEC
    8859 for Latin Zero covering the EURO Symbol and Full Support for
    the French and Finnish Language":

    ================================================================

    The US disapproves a project subdivision for ISO/IEC 8859-15 for
    the following reasons:

    1) It is the US long stated position that additional parts of
    8859 should not be created, except to capture existing 8-bit
    practice (viz Part 11). Rather than addressing problems with
    particular solutions, which are extremely costly to implement,
    industry efforts should be focused on implementing
    comprehensive solutions via the support of ISO/IEC 10646.

    2) From document WG3 N 388 it is clear that the intent is to
    replace ISO 8859-1, for the same user community. Because of
    the prominent role that 8859-1 has gained as the default
    character set in many internet protocols, introducing a near
    equivalent standard will have disastrous effects. Due to their
    large intersection part 1 and part 15 would appear to inter-operate
    without proper adherence to announcing mechanisms. Were part 15
    accepted and widely implemented, the result would be that no one
    could be sure that ANY character from the non-intersecting part of
    each set can be used reliably. In many ways, this situation is
    reminiscent of the problems that plagued the 7-bit sets of ISO 646.

    3) The adoption of ISO/IEC 10646 by the vendor community is
    making rapid progress, therefore it cannot be argued that a
    flawed solution must be accepted for lack of practical
    alternatives.

    ================================================================
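
    To make the "non-intersecting part" in point 2 concrete, here is a
    minimal Python sketch that lists every byte value the two parts decode
    differently. It assumes Python's standard "iso-8859-1" and "iso-8859-15"
    codecs; the function name differing_positions is made up for this
    illustration and comes from neither standard.

```python
import unicodedata


def differing_positions():
    """Return (byte, char_in_8859_1, char_in_8859_15) for each byte that differs."""
    rows = []
    for b in range(256):
        raw = bytes([b])
        c1 = raw.decode("iso-8859-1")
        c15 = raw.decode("iso-8859-15")
        if c1 != c15:
            rows.append((b, c1, c15))
    return rows


if __name__ == "__main__":
    for b, c1, c15 in differing_positions():
        print(
            f"0x{b:02X}  8859-1: U+{ord(c1):04X} {unicodedata.name(c1)}"
            f"  |  8859-15: U+{ord(c15):04X} {unicodedata.name(c15)}"
        )
```

    Only eight positions differ, which is why text in one part usually looks
    fine when read as the other, right up until one of those bytes appears.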

    It was already clear 13 years ago that 8859-15 wasn't going
    to work. It shouldn't be too surprising that 13 years later
    it still isn't working.

    As Mark indicated, the answer here is not to expect distributed
    systems to be able to reliably distinguish 8859-1 and 8859-15,
    when neither labelling nor heuristics for distinguishing them
    are reliable in the first place. The answer for reliable
    representation of the euro sign is to use UTF-8. And that answer
    was already obvious in 1997.
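
    As a rough illustration of the point above (a sketch using Python's
    standard codecs, not anything taken from the original thread): a euro
    sign sent as 8859-15 is a single byte that silently becomes a currency
    sign if the receiver assumes 8859-1, whereas the UTF-8 form is a
    three-byte sequence that cannot be silently confused with that byte.
    The variable names are invented for the example.

```python
EURO = "\u20ac"  # EURO SIGN

# Sender encodes as ISO 8859-15: one byte, 0xA4.
wire_latin9 = EURO.encode("iso-8859-15")

# Receiver assumes ISO 8859-1: the same byte decodes without error,
# but to U+00A4 CURRENCY SIGN instead of the euro sign.
misread = wire_latin9.decode("iso-8859-1")

# Sender encodes as UTF-8 instead: three bytes, E2 82 AC.
wire_utf8 = EURO.encode("utf-8")
roundtrip = wire_utf8.decode("utf-8")

# A lone 0xA4 byte is not valid UTF-8, so a UTF-8 receiver fails loudly
# on mislabelled 8859-15 data rather than showing the wrong character.
try:
    wire_latin9.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

if __name__ == "__main__":
    print(wire_latin9, repr(misread))
    print(wire_utf8, repr(roundtrip))
    print("0xA4 rejected as UTF-8:", rejected)
```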

    --Ken


