Re: Greek chars encoded twice -- why?

From: John H. Jenkins (jenkins@apple.com)
Date: Thu Feb 18 2010 - 13:45:30 CST

    On Feb 18, 2010, at 12:27 PM, Apostolos Syropoulos wrote:

    >
    > 2010/2/18 <vanisaac@boil.afraid.org>
    >
    > Unicode is based on reality, not on hypothetical perfection. It is the result of compromises that place necessity over convenience,
    >
    >
    > Really? What kind of reality is this that ignores the rules of a language that uses a specific alphabet?

    The reality of the merger with ISO/IEC 10646. The Greek national body insisted on the inclusion of the current set of extended Greek.

    Unicode has a lot of stuff that was either an unpleasant necessity (at the time) or, in retrospect, a bad idea. And there are some things that would have been done differently if we'd had in the early 1990s the experience we have now.

    One lesson that Unicode learned in the school of hard knocks was that we can't change character names (however badly misspelled), and we can't get rid of characters. We did that in the past, and it was disastrous. So even though there are duplicate encodings, misleading compatibility encodings, and things like that (and Greek is only the tip of the iceberg so far as such beasties go), we can't get rid of them. We can deprecate them and discourage their use, but that's about the extent of it.
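
    To make the duplicate-encoding point concrete, here is a minimal Python sketch using only the standard unicodedata module. It takes one pair as an example: U+1F71 (GREEK SMALL LETTER ALPHA WITH OXIA, from the extended Greek block) and U+03AC (GREEK SMALL LETTER ALPHA WITH TONOS). Both code points stay in the standard, but they are canonically equivalent, so normalization folds one into the other. The helper name same_text is purely illustrative.

```python
import unicodedata

OXIA_ALPHA = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA (Greek Extended block)
TONOS_ALPHA = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS (Greek and Coptic block)


def same_text(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw code points differ, so a naive comparison treats them as different text.
raw_equal = OXIA_ALPHA == TONOS_ALPHA

# After normalization the duplicate collapses onto the tonos form.
normalized_equal = same_text(OXIA_ALPHA, TONOS_ALPHA)
```

    Comparing or searching on normalized text is one way for software to live with duplicates that can never be removed from the standard.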

    As for the bad case-mappings, because case mapping can be language-specific, the best course of action is to make sure the data in the CLDR is correct and to encourage clients to use the CLDR for case mapping and related operations. Unicode is not meant to be used in isolation.
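
    As a rough illustration of why a default mapping is not enough, the Python sketch below contrasts the language-neutral str.upper() with a hand-rolled Greek variant that drops the tonos, following the common convention that Greek all-caps text is written without it. The function upper_greek_sketch is hypothetical and deliberately oversimplified (it ignores dialytika and other special cases); real clients would more likely rely on a library backed by locale data rather than on code like this.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # combining mark that carries the tonos after NFD


def upper_default(text: str) -> str:
    """Language-neutral uppercasing as provided by Python's str.upper()."""
    return text.upper()


def upper_greek_sketch(text: str) -> str:
    """Hypothetical, simplified Greek uppercasing: strip the tonos, then uppercase.

    Assumption: only the acute accent needs removing; real tailoring has more rules.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = decomposed.replace(COMBINING_ACUTE, "")
    return unicodedata.normalize("NFC", stripped.upper())


word = "\u03ac\u03bb\u03c6\u03b1"  # "alpha" in Greek, with a tonos on the first letter

default_result = upper_default(word)        # keeps the accent on the capital alpha
tailored_result = upper_greek_sketch(word)  # accent removed before uppercasing
```

    The two results differ only in the first letter, which is exactly the kind of difference that depends on the language of the text rather than on the characters themselves.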

    =====
    John H. Jenkins
    jenkins@apple.com


