From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Mon Dec 01 2003 - 17:36:25 EST
I would like to point out one of the new features of ICU 2.8, which is currently available as an
alpha release: http://oss.software.ibm.com/icu/download/2.8/
ICU 2.8 has the ability to handle m:n character conversion mappings driven by simple lines in
Unicode conversion tables (text files).
I sincerely hope that the availability of this feature will help argue against further assignments
of precomposed Unicode characters.
For example, the ibm-1390_P110-2003.ucm conversion table file (for EBCDIC Japanese with the JIS X
0213 repertoire) contains lines like
<U304B><U309A> \xEC\xB5 |0
which expresses the mapping between two Unicode code points (Hiragana Ka, U+304B, followed by the
combining semi-voiced sound mark, U+309A) and one DBCS sequence (the two bytes 0xEC 0xB5).
Either side of the mapping can contain multiple "characters" - Unicode code points on one side,
complete codepage byte sequences on the other.
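To make the line format concrete, here is a minimal Python sketch that splits one such mapping line into its parts. It is not ICU code: the function name parse_ucm_mapping and the regular expression are hypothetical, it handles only the simple line form shown above, and it assumes that a missing trailing flag can be treated as 0.

```python
import re

# Hypothetical pattern for one mapping line of the form shown above:
# one or more <Uxxxx> code points, one or more \xNN bytes, optional |flag.
_MAPPING_LINE = re.compile(
    r"^((?:<U[0-9A-Fa-f]{4,6}>)+)\s+((?:\\x[0-9A-Fa-f]{2})+)\s*(?:\|(\d))?\s*$"
)


def parse_ucm_mapping(line):
    """Return (code_points, codepage_bytes, flag) for one mapping line.

    Illustrative helper only; this is not how ICU itself reads .ucm files.
    """
    match = _MAPPING_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a simple mapping line: %r" % line)
    unicode_part, bytes_part, flag = match.groups()
    # Each <Uxxxx> token is one Unicode code point, written in hexadecimal.
    code_points = [
        int(h, 16) for h in re.findall(r"<U([0-9A-Fa-f]{4,6})>", unicode_part)
    ]
    # Each \xNN token is one codepage byte.
    codepage_bytes = bytes(
        int(h, 16) for h in re.findall(r"\\x([0-9A-Fa-f]{2})", bytes_part)
    )
    # Assumption: treat a missing flag as 0.
    return code_points, codepage_bytes, int(flag) if flag is not None else 0


code_points, codepage_bytes, flag = parse_ucm_mapping(r"<U304B><U309A> \xEC\xB5 |0")
text = "".join(chr(cp) for cp in code_points)  # two code points, one DBCS character
```

For the example line, code_points holds two entries (U+304B and U+309A) while codepage_bytes holds the single two-byte DBCS sequence, which is the 2:1 case of an m:n mapping.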
Best regards,
markus