[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #10906 (accepted charts)

Opened 7 months ago

Last modified 6 months ago

/charts/keyboards/layouts/: Editorial feedback

Reported by: Marcel Schneider <charupdate@…>
Owned by: mark
Component: keyboards
Data Locale: any
Phase: dsub
Review:
Weeks:
Data Xpath: /charts/32/keyboards/layouts/


The tables representing the keyboard layout charts should have a table header row containing the ISO column numbers. This will also make for an equal cell width.
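A minimal sketch of what such a header row could look like when a chart is built from ISO key positions. The function name `chart_rows` and the input shape (a dict keyed by positions such as "E01") are assumptions made for illustration; they are not part of the CLDR chart generator.

```python
def chart_rows(keys):
    """Build chart rows from a dict mapping ISO positions (e.g. 'E01') to labels.

    The first row is the header with the ISO column numbers; every following
    row starts with its ISO row letter and has one cell per column, so all
    rows have the same number of cells.
    """
    columns = sorted({pos[1:] for pos in keys})             # '00', '01', ...
    rows = sorted({pos[0] for pos in keys}, reverse=True)   # 'E' on top, 'A' at the bottom
    header = [""] + columns
    body = [[row] + [keys.get(row + col, "") for col in columns] for row in rows]
    return [header] + body


# Hypothetical sample data, only a few keys of a layout.
sample = {"E00": "`", "E01": "1", "D01": "q", "C01": "a"}
for line in chart_rows(sample):
    print(line)
```

Because missing positions are filled with empty cells, a renderer can give every column the same width.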

The modifier labels should be titlecased, and the left/right should be a prefix, not a suffix. The right Alt (AltGr) key can be labeled RAlt, but it CANNOT (seriously) be labeled “altR”!
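The relabeling asked for here is mechanical, as the following sketch suggests. It assumes the chart labels are a lowercase name optionally followed by "L" or "R" (only "altR" is attested in this ticket); the table of names and the function name are hypothetical.

```python
import re

# Assumed modifier names; only "alt" with suffix "R" is quoted in the ticket.
_TITLECASE = {
    "alt": "Alt",
    "ctrl": "Ctrl",
    "shift": "Shift",
    "caps": "Caps",
}


def normalize_modifier_label(label: str) -> str:
    """Turn a suffix-style label such as 'altR' into a prefix-style 'RAlt'."""
    match = re.fullmatch(r"([a-z]+)([LR]?)", label)
    if match is None:
        return label  # anything unexpected is left untouched
    name, side = match.groups()
    return side + _TITLECASE.get(name, name.capitalize())


print(normalize_modifier_label("altR"))   # RAlt
print(normalize_modifier_label("shift"))  # Shift
```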

(Wrong example)


Change History

comment:1 Changed 7 months ago by Marcel Schneider <charupdate@…>

Designing layout charts

Designing layout charts on a one-character-per-key basis is inefficient. That representation is biased by the usual display of layout editors with a graphic UI. It is also influenced by the XML way of defining layout levels (keyMaps).

By contrast, in C (and in derived formats such as KLC), a tabular format is used to define layouts. In C, keys are even grouped together in tables depending on the number of levels they are mapped on.

That is inspiring. To make for easily legible layout charts, the fix is to group four levels into one chart by dividing keycaps into four colored fields, with tooltips for the code point and additional information. See examples linked from ticket 10851.
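As a rough sketch of the idea, the per-level maps can be merged into one record per key, with one field per level and a tooltip carrying the code point. The level names, the sample data, and the function name below are assumptions for illustration only.

```python
def keycap_fields(levels, iso):
    """Collect what one key produces on each level, for display on one keycap.

    `levels` maps a level name to a dict {ISO position: output string}.
    Each field carries the character and a tooltip with its code point(s).
    """
    fields = {}
    for name, keymap in levels.items():
        char = keymap.get(iso)
        if char is not None:
            tooltip = " ".join(f"U+{ord(c):04X}" for c in char)
            fields[name] = {"char": char, "tooltip": tooltip}
    return fields


# Hypothetical sample: three of four levels mapped on key D01.
levels = {
    "base": {"D01": "q"},
    "shift": {"D01": "Q"},
    "altgr": {"D01": "@"},
    "shift+altgr": {},
}
print(keycap_fields(levels, "D01"))
```

A chart renderer would then draw one keycap per ISO position and place each field in its own colored quarter.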

Representing dead keys

XML layout definitions as used in macOS *.keylayout files represent dead keys in one mandatory way: all definitions must be grouped by base character. Every <action> element must have a unique identifier, and it is referred to by one or more <key> elements in one or more <keyMap> elements. By contrast, the Windows way of defining dead keys is to group them under headings showing the code of the dead character. However, this is found in *.klc files only. The C source uses a flat listing, where every sequence of dead character and base character is passed as arguments to a macro called DEADTRANS, or DEADTR (custom name), and the list can be sorted by either. XKB likewise uses a flat listing of multikey sequences, where dead keys are represented by their names in angle brackets.
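The three groupings carry the same information, which the sketch below tries to make visible. It starts from a flat list of (dead character, base character, result) triples and regroups it both ways. The triple order and the sample data are illustrative assumptions; they do not reproduce the argument order of DEADTRANS or any real file format.

```python
from collections import defaultdict

# Flat listing, in the spirit of the C source: one entry per sequence.
FLAT = [
    ("^", "a", "â"),
    ("^", "e", "ê"),
    ("´", "a", "á"),
    ("´", "e", "é"),
]


def by_dead_key(transforms):
    """Group under the dead character, as the *.klc headings do."""
    grouped = defaultdict(dict)
    for dead, base, composed in transforms:
        grouped[dead][base] = composed
    return dict(grouped)


def by_base_character(transforms):
    """Group under the base character, as *.keylayout actions do."""
    grouped = defaultdict(dict)
    for dead, base, composed in transforms:
        grouped[base][dead] = composed
    return dict(grouped)


print(by_dead_key(FLAT))
print(by_base_character(FLAT))
```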

The <transform> element used by current LDML might be appropriate for XKB, and it might claim a similarity with Windowsʼs deadtransform function, but it is misleading and inappropriate as a normative representation, and it is inefficient for up-to-date keyboarding in numerous locales. Its representation in LDML, using spacing diacritics to stand for dead keys, is plain wrong, as it fails to represent the dead key property. E.g., both '^' and 'ê' could be used to represent a dead key in LDML, though both are also used in isolation as spacing characters. That can be fixed by a notational convention using single angle quotation marks to bracket dead characters and dead key names, e.g. ‹^›, ‹ê›, ‹circumflex accent›, ‹circumflex›, ‹circum›, ‹asciicircum›. XKB uses <dead_circumflex>, <dead_acute>, and so on.

Such notational conventions, however, donʼt suffice for a streamlined and unambiguous XML representation of keyboard layouts in CLDR. The fix is to refer to dead keys as selectors. See ticket 10898.

LDML syntax additions

In LDML, we need to add the selector attribute to the <map> element. It replaces the to attribute where it is present, and its value is the name of the dead key, like in the <action> element on macOS, but it is handled differently. Instead of the <actions> element of macOS (which does not exist in LDML), there is then a <selectors> element, whose children are the <selector> elements. These have the name attribute, which is the dead key name, and their children are <map> elements (instead of <when> elements). As when occurring in the <keyMap> element, the <map> elements have their iso attribute, and may have a to (output) attribute and/or a selector attribute. The latter is for support of chained dead keys (also called serial dead keys), and if present along with to, for support of iterative dead keys (used like the next attribute, see ticket 10851). Unlike <when> elements, <key> elements donʼt need a state attribute, since that is implied by their parent grouping them together.

Note that this is no real keyboard implementation, but a neutral representation of functionality as required for CLDR. The above is proposed LDML syntax only.
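To make the proposal more tangible, here is a sketch of how such markup might look and how a reader of the data could resolve a key sequence from it. The XML fragment is hypothetical: it follows the element and attribute names described above (<selectors>, <selector name>, <map iso to selector>) but is not valid LDML, and the key positions, the wrapping <keyboard> element, and the `type_keys` function are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment following the proposed syntax; not valid LDML today.
PROPOSED = """
<keyboard>
  <keyMap>
    <map iso="D11" selector="circumflex"/>
    <map iso="D03" to="e"/>
    <map iso="D01" to="a"/>
  </keyMap>
  <selectors>
    <selector name="circumflex">
      <map iso="D03" to="ê"/>
      <map iso="D01" to="â"/>
      <map iso="D11" to="^"/>
    </selector>
  </selectors>
</keyboard>
"""


def type_keys(root, iso_sequence):
    """Return the text produced by pressing the given ISO positions in order."""
    base = {m.get("iso"): m for m in root.find("keyMap")}
    selectors = {
        s.get("name"): {m.get("iso"): m for m in s}
        for s in root.find("selectors")
    }
    output, active = [], None
    for iso in iso_sequence:
        table = selectors[active] if active else base
        entry = table.get(iso)
        active = None
        if entry is None:
            continue  # unmapped key: this sketch simply drops the pending dead key
        if entry.get("to") is not None:
            output.append(entry.get("to"))
        active = entry.get("selector")  # chained or iterative dead key, if any
    return "".join(output)


root = ET.fromstring(PROPOSED)
print(type_keys(root, ["D11", "D03"]))
```

Note that the <map> elements inside a <selector> carry no state attribute: the state is implied by the enclosing <selector>, as the text above describes.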

comment:2 Changed 7 months ago by Marcel Schneider <charupdate@…>

Not exactly… See ticket #10901 for an update.

comment:3 Changed 7 months ago by Marcel Schneider <charupdate@…>

Please refer to comment 6 there.

comment:4 Changed 6 months ago by mark

  • Owner changed from anybody to mark
  • Status changed from new to accepted

comment:5 Changed 6 months ago by Marcel Schneider <charupdate@…>

Copy-pasting comment from #10901:

Part of proposed edits in UTS #35-7 are in online draft TR

Basic edits suggested so far (and some more) are now implemented in a draft version of this part of UTS #35.
But major edits such as most syntax extensions are not yet:


Both source code and style sheet can be downloaded via the default index page.

However please note that work is still in progress.

comment:6 Changed 6 months ago by Marcel Schneider <charupdate@…>

Referencing key positions

The proposed representation of four key positions per key in a key map chart brings the need to clearly define how they will then be referenced.

Ruling out the idea of the Shift + AltGr key combination as a remnant generic group selector (access to the common secondary group per ISO/IEC 9995) and conceptualizing the group selector dead key (see #10898) brings the need to streamline the way key positions to the right are referred to.

  • The ISO way is to consider AltGr as level 3 and to cast it under the pile of Base and Shift shift states. Shift+AltGr does not exist as a layout level in ISO/IEC 9995.
  • The legacy way is to refer to AltGr as level 3 and to Shift+AltGr as level 4, and to represent them as two piles of two levels.

To make the level number expressive as of whether Shift is pressed, the proposed way is to always call a level “1” when Shift is up (and Caps Lock is off [or Caps Lock is on and Shift is down]), and to call it “2” when Shift is down (or Caps Lock is on).
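Read literally, this rule makes the level number depend on whether exactly one of Shift and Caps Lock is active. A minimal sketch, assuming Caps Lock affects the key in question (real layouts decide that per key) and using a hypothetical function name:

```python
def level_number(shift_down: bool, caps_lock_on: bool) -> int:
    """Return 1 or 2 according to the proposed naming rule.

    Level 2 when exactly one of Shift and Caps Lock is active;
    level 1 when neither or both are active.
    """
    return 2 if shift_down != caps_lock_on else 1


for shift in (False, True):
    for caps in (False, True):
        print(f"Shift={shift!s:5} CapsLock={caps!s:5} -> level {level_number(shift, caps)}")
```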

As for the right side of the key, it was proposed to call it “pile 2”. But the drawbacks are numerous:

  • The term is uncommon on keyboards and is not immediately understandable in the proposed context.
  • The references get lengthy: “Level 1 pile 2 group 1”.
  • The term presents a localization challenge.
  • Treating the right half of the key as a unit of its own keeps alive the temptation to relabel it as “group.”

We now propose to simply say “level 1a, level 2a, level 1b, level 2b.”

The advantage is the brevity and familiarity of the scheme, similar to the one used in spreadsheets, but reversed to avoid confusion with ISO key numbers.
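The naming scheme can be sketched as a small function. It assumes that the letter "a" denotes the left side of the key (AltGr up) and "b" the right side (AltGr down), which is how the comment above reads; Caps Lock is left out to keep the sketch short, and the function name is hypothetical.

```python
def position_label(shift_down: bool, altgr_down: bool) -> str:
    """Name a key position as 'level 1a', 'level 2a', 'level 1b' or 'level 2b'."""
    number = 2 if shift_down else 1      # the digit tells whether Shift is pressed
    side = "b" if altgr_down else "a"    # assumed: 'b' is the right (AltGr) side
    return f"level {number}{side}"


print(position_label(False, False))  # level 1a
print(position_label(True, True))    # level 2b
```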

comment:7 Changed 6 months ago by Marcel Schneider <charupdate@…>

Proposed revision of part 7 of UTS #35

The primary edits are completed. The proposed revision is now posted here:


(Old URL redirects here.)

comment:8 Changed 6 months ago by Marcel Schneider <charupdate@…>

Including the numeric keypad in the layout charts

The Belgian fr-BE (period) keyboard differs from the regular one in having the period on the numpad decimal key instead of the comma. This shows up in CLDR neither in the charts nor in the tags, due to the exclusion of the numpad. MSKLC allows editing that single key (unlike the other keys of the numpad); editing the rest is done by editing the source code. It is necessary for CLDR to include the numeric keypad. To keep charts minimal, the decimal separator key may be nested in row A, column 13.
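The following sketch illustrates why the two Belgian layouts are indistinguishable as long as the numpad is excluded. The alphanumeric keys are placeholders rather than real fr-BE data, and both the position name "A13" for the nested decimal key and the function name are assumptions based on the suggestion above.

```python
# Placeholder alphanumeric section shared by both variants (not real fr-BE data).
ALPHANUMERIC = {"D01": "a", "D02": "z"}


def layout(decimal_key: str, include_numpad: bool) -> dict:
    """Return a key map, optionally nesting the numpad decimal key at A13."""
    keys = dict(ALPHANUMERIC)
    if include_numpad:
        keys["A13"] = decimal_key
    return keys


# Without the numpad, the comma and period variants look identical.
print(layout(",", False) == layout(".", False))
# With the decimal key included, the difference becomes visible.
print(layout(",", True) == layout(".", True))
```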

The _platform.xml file should be completed exhaustively on that occasion, because complete keyboard layouts encompass the numpad to enhance user experience.

Linux already has layouts mapping graphic arrows on the AltGr levels (AltGr + Numpad yields single arrows, Shift+AltGr + Numpad yields double arrows).

On Windows, the Shift level is available after disabling the legacy cursor move and edit functionalities. This is done by adding some code in the driver header; see the sample header file linked on the page cited hereafter: kbcommon.h(407, 430). The Shift level is then easily usable either by holding down the left-hand Shift key for specific keypresses (double and triple zeroes, thousands separator [NNBSP or comma/period], prefixes, hex letter digits, styled arithmetical operators), or by enabling CapsLock (notably to toggle between period and comma). Programmer Lock may also usefully affect some of those keys (more ASCII operators and parentheses even without extra hardware keys, other and more prefixes, lowercase hex letter digits instead of uppercase if desirable). See sample layout on http://charupdate.info/doc/kbenintu/

Some of this has also been filed as a new ticket wrt the Belgian locale. See ticket #10960: Belgian keyboards need disambiguation.

