L2/01-006

Title: Reply to Georgian State Department of Information Technology

Source: Unicode Technical Committee

Author: Lisa Moore, Chair, Unicode Technical Committee

Distribution: David Tarkhan-Mouravi, Chairman, Georgian State Dept. of IT

Mark Davis, President, Unicode Consortium

Arnold Winkler, Vice Chair, Unicode Technical Committee

Mike Ksar, Convener, JTC1/SC2/WG2

Action: For Review and Response

Date: December 22, 2000

The Unicode Technical Committee (UTC) wishes to thank Chairman David Tarkhan-Mouravi for his letter giving recent resolutions in Georgia on the encoding of historical and modern Georgian characters. We have studied this letter and the accompanying code chart and have discussed them within the committee. Based on the committee’s understanding of your correspondence, we have the following comments and requests for further information.

1. Ordering. The proposed code chart shows a significant reordering of the Georgian characters from their current encoding in the Unicode Standard. Please be advised that we cannot accommodate this request. Neither the Unicode Standard nor ISO 10646 ever moves characters once they have been encoded. This is the policy of both standards bodies because it is of utmost concern to the implementers of these standards that the character assignments be stable.

The correct ordering of an encoding of Georgian characters is of importance to the Georgian National Standards body, but for an international character encoding standard which represents the characters of many scripts, languages, and countries, correct ordering is achieved through collation tables. Proper collation is never achieved solely through character encoding order. There are other ISO and Unicode standards that address collation: ISO/IEC 14651 and the Unicode Collation Algorithm. In both of these standards the order preferred by the Georgian State Department of IT has been respected.
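To illustrate the distinction between encoding order and collation order, the following is a minimal sketch in Python. The weights in `COLLATION_WEIGHTS` are hypothetical and cover only three letters; they are not taken from ISO/IEC 14651 or the Unicode Collation Algorithm, and the helper name `collation_key` is likewise an assumption made for illustration.

```python
# Hypothetical collation table: each character is assigned a sort weight.
# The weights are illustrative only; real collation data is far richer.
COLLATION_WEIGHTS = {
    "\u10D6": 1,  # GEORGIAN LETTER ZEN
    "\u10F1": 2,  # GEORGIAN LETTER HE (encoded after the modern letters)
    "\u10D7": 3,  # GEORGIAN LETTER TAN
}


def collation_key(text):
    """Translate a string into a list of weights used for comparison."""
    return [COLLATION_WEIGHTS[ch] for ch in text]


letters = ["\u10F1", "\u10D7", "\u10D6"]

# Sorting by code point follows the encoding order: ZEN, TAN, HE.
by_code_point = sorted(letters)

# Sorting with the table follows the desired order: ZEN, HE, TAN.
by_collation = sorted(letters, key=collation_key)
```

The encoded positions of the characters are untouched; only the table supplied to the sort determines the resulting order.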

While the ordering of a character encoding may seem like a problem when considering interoperability between Unicode and 8-bit implementations that follow de facto practice or national standards, in practice we have not seen this cause any difficulties. The appropriate technical solution is the application of mapping tables. As long as there is a one-to-one mapping of characters between a Georgian encoding standard and the Unicode Standard, Georgian characters can always be mapped without loss of data.
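As a rough sketch of how such a mapping table works, the Python fragment below converts between a hypothetical 8-bit Georgian encoding and Unicode. The byte values 0xC0 to 0xC2 are invented for illustration and do not correspond to any actual national or de facto standard; the function names are also assumptions.

```python
# Hypothetical 8-bit assignments for three mkhedruli letters.
# These byte values are invented; a real table would cover the full repertoire.
TO_UNICODE = {
    0xC0: "\u10D0",  # GEORGIAN LETTER AN
    0xC1: "\u10D1",  # GEORGIAN LETTER BAN
    0xC2: "\u10D2",  # GEORGIAN LETTER GAN
}

# The mapping must be one-to-one for the reverse table to be well defined.
assert len(set(TO_UNICODE.values())) == len(TO_UNICODE)
FROM_UNICODE = {ch: byte for byte, ch in TO_UNICODE.items()}


def decode(data: bytes) -> str:
    """Convert 8-bit encoded data to a Unicode string."""
    return "".join(TO_UNICODE[b] for b in data)


def encode(text: str) -> bytes:
    """Convert a Unicode string back to the 8-bit encoding."""
    return bytes(FROM_UNICODE[ch] for ch in text)


original = bytes([0xC2, 0xC0, 0xC1])
round_tripped = encode(decode(original))
```

Because every byte maps to exactly one character and back, the round trip returns the original data regardless of how the two encodings order their characters.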

2. Additional characters. The document we received shows three mkhedruli characters at 10F7, 10F8, and 10F9 that are not currently encoded in the Unicode Standard. The UTC is quite willing to entertain a proposal for these characters, but we would need more information. For new character proposals we ask that requesters consult the Unicode encoding guidelines at http://www.unicode.org/pending/proposals.html and submit a WG2 Proposal Form (http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/form1.html). In addition, we request examples of usage in text along with proposed names for the new characters.

3. Nuskhuri. The UTC fully understands that Information Technology in Georgia requires all three forms of Georgian: asomtavruli, nuskhuri, and mkhedruli. However, we do believe that all three forms can be properly addressed in the current Unicode encoding by using appropriate fonts to represent the nuskhuri form. We would be interested in hearing from the Georgian Department of IT why a solution that can represent asomtavruli, nuskhuri, and mkhedruli with the current Unicode encoding and an appropriate set of fonts might not be adequate for Georgian requirements.

Finally, please be aware that both the Unicode Technical Committee and the ISO JTC1/SC2 Working Group 2 will need to independently consider your proposals, as both standards bodies are committed to maintaining synchrony between the Unicode Standard and ISO 10646. If you can forward a response to the UTC before January 27, 2001, we will be most happy to discuss the subject again during the January 29 to February 1, 2001 UTC meeting.