Character list for European and Canadian use in the revised keyboard standard ISO/IEC 9995-3, supplementing MES-1

From: Karl Pentzlin (karl-pentzlin@acssoft.de)
Date: Wed Sep 17 2008 - 17:46:29 CDT


    At the ISO/IEC JTC 1/SC 35/WG 1 meeting in Naples, Italy, last week
    (ending 2008-09-12), significant progress was made on the
    ISO/IEC 9995 group of keyboard standards.

    Two resolutions concerned the initiation of a new work item,
    “ISO/IEC 9995-9 Multilingual, multiscript keyboard group layouts”.

    With that work item started, the existing ISO/IEC 9995-3 "Complementary
    layouts of the alphanumeric zone of the alphanumeric section" will be
    revised only to accommodate the needs of its current users. Outside
    Europe (the original scope, as indicated by its reference to MES-1),
    that means only Canada.

    This will be done by allowing a choice between two secondary groups:
    the "outdated" one, which is exactly the secondary group in the current
    edition of the standard, and the "current" one, which revises the
    former by adding characters to the unused positions in Level 3 and by
    some reordering (especially so that the case pairs of added letters
    can be allocated to levels 1/2 of a key as usual).

    While the new ISO/IEC 9995-9 will take some time, the revised ISO/IEC
    9995-3 is planned to reach FDIS stage at the next SC 35 meeting in
    Berlin in February 2009, providing a readily available solution for
    multilingual input at least for the Latin-writing parts of Europe as
    well as for Canada, and maybe also for the USA and South Africa.

    The tentative list of characters to be added is as shown below.

    Any hints and comments are welcome.

    -- Karl Pentzlin

    --
    Tentative character list (diacritical marks are marked with "(D)"),
    supplementing MES-1 (Multilingual European Subset 1 of ISO 10646,
    i.e. character collection 281):
     A. Fulfilling European needs:
        ==========================
    U+20A0 EURO-CURRENCY SIGN
     (already included in ISO/IEC 9995-3)
     For Icelandic:
    U+00D0 LATIN CAPITAL LETTER ETH
     (This character is not the same as:
      U+0110 LATIN CAPITAL LETTER D WITH STROKE
      although it looks the same or similar in most fonts.)
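The distinction noted above can be checked against the Unicode character data. The following is a minimal sketch using Python's standard `unicodedata` module; the variable names are illustrative only and do not come from the standard.

```python
import unicodedata

ETH = "\u00D0"       # LATIN CAPITAL LETTER ETH (Icelandic)
D_STROKE = "\u0110"  # LATIN CAPITAL LETTER D WITH STROKE

# The capitals look alike in most fonts, but they are separate
# characters with separate lowercase counterparts.
for ch in (ETH, D_STROKE):
    low = ch.lower()
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}"
          f" -> U+{ord(low):04X} {unicodedata.name(low)}")
```

A keyboard that offered only one of the two capitals would therefore produce the wrong small letter after case conversion.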
      
     For Sami:
    U+01B7 LATIN CAPITAL LETTER EZH
    U+0292 LATIN SMALL LETTER EZH
    U+0335 COMBINING SHORT STROKE OVERLAY (D)
     (This diacritical mark allows the input of:
      U+01E4 LATIN CAPITAL LETTER G WITH STROKE
      U+01E5 LATIN SMALL LETTER G WITH STROKE
      by the means described in ISO/IEC 9995-3.
      These characters are not listed explicitly, because other Latin
      letters with stroke, needed for Canadian use, are to be entered
      in the same way; none of them therefore occupies a place in the
      layout table.)
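As a rough illustration of the note above: the letters with stroke have no canonical decomposition in Unicode, so normalization alone does not turn "g" plus U+0335 into U+01E5. An input method needs its own table. The sketch below assumes a hypothetical table `STROKE_TABLE` and helper `apply_stroke`; neither is taken from ISO/IEC 9995-3, and the table covers only the two letters named in this list.

```python
import unicodedata

STROKE = "\u0335"  # COMBINING SHORT STROKE OVERLAY

# Hypothetical lookup table: base letter -> precomposed letter with stroke.
STROKE_TABLE = {
    "G": "\u01E4",  # LATIN CAPITAL LETTER G WITH STROKE
    "g": "\u01E5",  # LATIN SMALL LETTER G WITH STROKE
}

def apply_stroke(base):
    """Return the precomposed letter if the table has one,
    otherwise the base letter followed by the combining mark."""
    return STROKE_TABLE.get(base, base + STROKE)

# NFC leaves the sequence as two code points, because U+01E5 has
# no canonical decomposition to compose from.
nfc_result = unicodedata.normalize("NFC", "g" + STROKE)
print(len(nfc_result))
print(unicodedata.name(apply_stroke("g")))
```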
     For Azerbaijani:
    U+018F LATIN CAPITAL LETTER SCHWA
    U+0259 LATIN SMALL LETTER SCHWA
     (Special note on the schwa: According to
      http://www.ssimicro.com/fonts/dene/keystrk.pdf , the Dene
      language of Saskatchewan (Canada) also uses the schwa, but lists
      a reversed E as its uppercase form, like the Nigerian reversed E
      U+018E/U+01DD. Is it OK for Dene to consider this simply
      a font variant, rather than having to add the reversed E?
      Adding it is problematic, as the lowercase forms of the two
      letters are encoded differently but look the same.)
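The encoding problem raised in the schwa note can be made concrete with the Unicode case mappings. This is a small sketch relying only on Python's built-in case conversion; the dictionary name `CASE_PAIRS` is an illustrative choice.

```python
import unicodedata

# (capital, small) pairs as encoded in Unicode.
CASE_PAIRS = {
    "schwa": ("\u018F", "\u0259"),
    "reversed/turned E": ("\u018E", "\u01DD"),
}

for label, (upper, lower) in CASE_PAIRS.items():
    # Each small letter maps back to its own capital only, even though
    # the two small letters look the same in print.
    print(f"{label}: U+{ord(upper):04X} {unicodedata.name(upper)}"
          f" <-> U+{ord(lower):04X} {unicodedata.name(lower)}")
```

Text typed with U+0259 will thus uppercase to U+018F, never to the reversed E, which is why treating the Dene capital as a font variant avoids a second, visually identical small letter.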
     For Vietnamese (serving the Vietnamese living in Europe):
    U+0309 COMBINING HOOK ABOVE (D)
    U+031B COMBINING HORN (D)
     For German and other languages:
    U+201A SINGLE LOW-9 QUOTATION MARK
    U+201E DOUBLE LOW-9 QUOTATION MARK
     For Norwegian:
    U+214D AKTIESELSKAB
     For writing German Fraktur (Blackletter) and Irish Gaelic
     (script variant with "old fashioned" look but with some
      contemporary use):
    U+017F LATIN SMALL LETTER LONG S
    U+027C LATIN SMALL LETTER R WITH LONG LEG
    U+200C ZERO WIDTH NON-JOINER
     (needed to prevent ligatures in Fraktur where forbidden by
      special German orthographic rules, especially at the border
      of word compounds but not regularly on syllable boundaries)
    U+204A TIRONIAN SIGN ET
     (besides common Gaelic use, also used in Fraktur for the
      "et" in the abbreviation "etc."; then commonly misnamed
      "r rotunda" because of its similarity to that medieval glyph
      variant of "r")
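To illustrate the zero width non-joiner entry in the Fraktur group above: the character is inserted at a compound boundary so that a font does not form a ligature across it. The word "Auflage" (Auf + lage) is an example chosen here for illustration and does not come from the list; the helper `strip_zwnj` is likewise hypothetical.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

# "Auf" + "lage": the f and l belong to different parts of the
# compound, so a ligature between them is to be suppressed.
word = "Auf" + ZWNJ + "lage"

def strip_zwnj(text):
    """Remove the invisible joiner control, e.g. before searching or sorting."""
    return text.replace(ZWNJ, "")

print(len(word))         # one code point more than the visible letters
print(strip_zwnj(word))
```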
     For transliteration of many languages including Arabic,
     Hebrew, and Sanskrit; religious communities in particular
     transliterate names occurring in their holy scriptures
     correctly even in literature addressed to the general public
     (transliterations according to ISO 233, DIN 31635, ISO 15919):
    U+02BE MODIFIER LETTER RIGHT HALF RING
    U+02BF MODIFIER LETTER LEFT HALF RING
    U+02C0 MODIFIER LETTER GLOTTAL STOP
    U+02C1 MODIFIER LETTER REVERSED GLOTTAL STOP
    U+02C8 MODIFIER LETTER VERTICAL LINE
    U+02CC MODIFIER LETTER LOW VERTICAL LINE
    U+0310 COMBINING CANDRABINDU (D) - no own key needed, enter as:
       "dot above" followed by "breve"
    U+0331 COMBINING MACRON BELOW (D)
    U+0324 COMBINING DIAERESIS BELOW (D) - no own key needed, enter as:
       "dot below" followed by "dot below" again
    U+0325 COMBINING RING BELOW (D)
    U+032E COMBINING BREVE BELOW (D) - no own key needed:
      in the intended application area of ISO/IEC 9995-3, this
      diacritical mark is used only with the letter h/H, and the
      common (above) "breve" is never used with that letter there.
      It therefore suffices to state that a "breve" applied to the
      letter "h" or its uppercase form shall be placed below that
      letter, thus in fact yielding a "combining breve below".
    U+0347 COMBINING EQUALS SIGN BELOW (D) - no own key needed, enter
       as: "combining macron below" followed by that key again
    U+035F COMBINING DOUBLE MACRON BELOW (D) - no own key needed,
       enter as: "combining low line" followed by that key again
       ("combining low line" see below at "fulfilling Canadian needs")
     Punctuation marks and symbols:
    U+2013 EN DASH
     ("long dash" as used in Germany and other countries)
    U+2014 EM DASH
     ("long dash" as used in English-speaking and other countries)
    U+2032 PRIME
    U+2033 DOUBLE PRIME
     (for "degree minute/second" and for measure "foot/inch")
    U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
    U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
     Elementary mathematical symbols:
    U+2248 ALMOST EQUAL TO - no own key needed, enter as:
      diacritical mark "tilde" + base character "tilde"
    U+2260 NOT EQUAL TO - no own key needed, enter as:
      diacritical mark "combining short solidus overlay" + "equals sign"
    U+2264 LESS-THAN OR EQUAL TO - no own key needed, enter as:
      diacritical mark "combining macron below" + "less-than sign"
    U+2265 GREATER-THAN OR EQUAL TO - no own key needed, enter as:
      diacritical mark "combining macron below" + "greater-than sign"
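The "no own key needed" entries in section A amount to a small table of two-key sequences. The sketch below collects them in one place. It is only an illustration: representing each diacritical key by its combining character, the names `SEQUENCES`, `resolve`, and `describe`, and the choice of U+0307, U+0306, U+0303, and U+0323 for "dot above", "breve", "tilde", and "dot below" are assumptions, not text from ISO/IEC 9995-3.

```python
import unicodedata

# Diacritical keys, represented here by their combining characters (assumed).
DOT_ABOVE, BREVE, TILDE = "\u0307", "\u0306", "\u0303"
DOT_BELOW, MACRON_BELOW = "\u0323", "\u0331"
LOW_LINE, SHORT_SOLIDUS = "\u0332", "\u0337"

# (first key, second key) -> resulting text, as listed in section A.
SEQUENCES = {
    (DOT_ABOVE, BREVE): "\u0310",            # COMBINING CANDRABINDU
    (DOT_BELOW, DOT_BELOW): "\u0324",        # COMBINING DIAERESIS BELOW
    (MACRON_BELOW, MACRON_BELOW): "\u0347",  # COMBINING EQUALS SIGN BELOW
    (LOW_LINE, LOW_LINE): "\u035F",          # COMBINING DOUBLE MACRON BELOW
    (BREVE, "h"): "h\u032E",                 # breve goes below h
    (BREVE, "H"): "H\u032E",
    (TILDE, "~"): "\u2248",                  # ALMOST EQUAL TO
    (SHORT_SOLIDUS, "="): "\u2260",          # NOT EQUAL TO
    (MACRON_BELOW, "<"): "\u2264",           # LESS-THAN OR EQUAL TO
    (MACRON_BELOW, ">"): "\u2265",           # GREATER-THAN OR EQUAL TO
}

def resolve(first, second):
    """Return the text for a two-key sequence, or None if it is not listed."""
    return SEQUENCES.get((first, second))

def describe(text):
    """Spell out each code point of a string with its Unicode name."""
    return " + ".join(f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text)

for (first, second), result in SEQUENCES.items():
    print(f"{describe(first)}, {describe(second)} => {describe(result)}")
```

A table is needed for the mathematical symbols as well: of the four, only U+2260 has a canonical decomposition, and that one uses U+0338 COMBINING LONG SOLIDUS OVERLAY rather than the short solidus listed here.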
     B. Fulfilling Canadian needs, for aboriginal languages:
        ====================================================
        (preliminary)
     Confirmed use:
    U+019B LATIN SMALL LETTER LAMBDA WITH STROKE
     (Examples in Unicode document L2/05-194.
      Unicode does not yet contain an uppercase variant)
    U+019E LATIN SMALL LETTER N WITH LONG RIGHT LEG
    U+0220 LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
    U+0222 LATIN CAPITAL LETTER OU
    U+0223 LATIN SMALL LETTER OU
    U+0241 LATIN CAPITAL LETTER GLOTTAL STOP
    U+0242 LATIN SMALL LETTER GLOTTAL STOP
    U+0294 LATIN LETTER GLOTTAL STOP
    U+02B7 MODIFIER LETTER SMALL W
     (Examples in Unicode document L2/05-194.
      Does the uppercase counterpart U+1D42 MODIFIER LETTER CAPITAL W
      also need to be included?)
    U+0332 COMBINING LOW LINE (D)
    U+0337 COMBINING SHORT SOLIDUS OVERLAY (D)
     (to generate several letters for the Sencoten language
      of Vancouver Island)
     Use to be confirmed:
    U+02BB MODIFIER LETTER TURNED COMMA
     (In fact used for languages of the USA, such as Hawaiian and
      Samoan. Adding this letter makes ISO/IEC 9995-3 applicable to
      the USA; at least I have not yet found any other Latin letters
      used for US aboriginal languages which are not contained in
      this list or already in MES-1.)
    U+02BC MODIFIER LETTER APOSTROPHE
    U+0313 COMBINING COMMA ABOVE (D)
    U+1D43 MODIFIER LETTER SMALL A
     (found in a font made for Dene and Lakota - if use is confirmed:
      Does the uppercase counterpart U+1D2C MODIFIER LETTER CAPITAL A
      also need to be included?)
     C. "Nice to have" (if there are gaps left):
        ========================================
     For South Africa:
    U+032D COMBINING CIRCUMFLEX ACCENT BELOW (D)
     (for Venda (Tshivenda), one of the 11 official languages of the
      Republic of South Africa. The contemporary spellings of the 10
      other ones use only characters which are already contained in
      MES-1, as the Khoisan click letters are replaced by combinations
      of standard Latin letters. Thus, adding this character would
      make ISO/IEC 9995-3 applicable to South Africa.)
     Special hyphens and spaces:
    U+00A0 NO-BREAK SPACE
    U+202F NARROW NO-BREAK SPACE
    U+2011 NON-BREAKING HYPHEN
     For transliteration of Cyrillic names:
    U+02B9 MODIFIER LETTER PRIME
    U+02BA MODIFIER LETTER DOUBLE PRIME
     (the use of the similar-looking symbols "prime" and "double
      prime" is possible in principle, although these are not letters)
     Other symbols:
    U+2026 HORIZONTAL ELLIPSIS
     For German special applications:
    U+1E9E LATIN CAPITAL LETTER SHARP S
     D. Not yet in ISO 10646, and use not yet confirmed:
        ================================================
     (the given Unicode values are a temporary replacement option)
    U+0398 Latin capital letter theta
    U+03B8 Latin small letter theta
     (possibly used for some variants of the Romany ("Gypsy")
      language orthography in Europe)
     (possibly used in Canada for an aboriginal language of Quebec)
    

