Date of last update: 1999-10-08
Revision history:
1. 1999-10-07: created for Unicode 3.0
2. 1999-10-08: editorial changes
This file consists of tables linking to the mapping data files that are available. For the most current information, please refer to the Unicode ftp site for mapping data (ftp://ftp.unicode.org/Public/MAPPINGS/).
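As a rough illustration of how such a mapping data file might be consumed, the sketch below parses a table and uses it to decode bytes. It assumes the commonly seen layout of one entry per line: the encoding's code in hex, the Unicode code point in hex, then a `#` comment. The `parse_mapping` helper and the `SAMPLE` excerpt are hypothetical and are not copied from any file on the ftp site; check the header of each real file for its actual column layout.

```python
import io


def parse_mapping(lines):
    """Build {source code: Unicode code point} from mapping-table lines.

    Assumed (not guaranteed) line format:
        <hex code> <whitespace> <hex Unicode> <whitespace> # comment
    """
    table = {}
    for raw in lines:
        # Drop the trailing comment and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue  # code listed without a Unicode value (undefined)
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


# Hypothetical excerpt written for this example only.
SAMPLE = """\
# Sample table (illustrative)
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xE9\t0x00E9\t# LATIN SMALL LETTER E WITH ACUTE
0x81\t\t#UNDEFINED
"""

mapping = parse_mapping(io.StringIO(SAMPLE))

# Decode a byte string of a single-byte encoding with the parsed table.
decoded = "".join(chr(mapping[b]) for b in b"\x41\xe9")
print(decoded)
```

A dictionary keyed by integer code keeps the sketch usable for both single-byte and multi-byte tables; only the final decoding loop assumes one byte per character.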
Character encoding |
Mapping to Unicode/UTF-16 |
Date of last update |
Remark |
UTF-16 |
Identity |
|
UCS-2 extended to planes 0-16. |
UTF-8 |
Given by algorithm (normative) |
|
In Unicode, UTF-8 is limited to planes 0-16. |
SCSU |
Given by algorithm (UTR 6) |
|
Standard Compression Scheme for Unicode |
UTF-32 |
Given by algorithm (UTR 19) |
|
UCS-4 limited to planes 0-16. |
|
|
|
|
ISO/IEC 646:1991-IR |
(By implicit algorithm) |
|
7-bit ASCII; US-ASCII. |
ISO/IEC 646-SE/FI |
|
|
|
ISO/IEC 646-DK/NO |
|
|
|
ISO/IEC 646-DE |
|
|
|
ISO/IEC 646-FR |
|
|
|
ISO/IEC 646-IT |
|
|
|
ISO/IEC 646-ES |
|
|
|
|
|
|
|
ISO/IEC 6937:1994 |
|
|
Note that combining characters are stored before the base character for ISO/IEC 6937. |
|
|
|
|
ISO/IEC 8859-1:1998 |
1999 July 27 |
Latin-1 |
|
ISO/IEC 8859-2:1999 |
1999 July 27 |
Latin-2 |
|
ISO/IEC 8859-3:1999 |
1999 July 27 |
Latin-3 |
|
ISO/IEC 8859-4:1998 |
1999 July 27 |
Latin-4 |
|
ISO/IEC 8859-5:1999 |
1999 July 27 |
Latin/Cyrillic |
|
ISO/IEC 8859-6:1999 |
1999 July 27 |
Latin/Arabic |
|
ISO/IEC 8859-7:1987 |
1999 July 27 |
Latin/Greek |
|
ISO/IEC 8859-8:1999 |
1999 July 27 |
Latin/Hebrew |
|
ISO/IEC 8859-9:1999 |
1999 July 27 |
Latin-5 |
|
ISO/IEC 8859-10:1998 |
1999 July 27 |
Latin-6 |
|
ISO/IEC 8859-11 |
|
|
Latin/Thai |
ISO/IEC 8859-12 |
|
|
Unused 8859 part number |
ISO/IEC 8859-13:1998 |
1999 July 27 |
Latin-7 |
|
ISO/IEC 8859-14:1998 |
1999 July 27 |
Latin-8 |
|
ISO/IEC 8859-15:1999 |
1999 July 27 |
Latin-9 (Latin-1 replacement) |
|
ISO/IEC 8859-16 |
|
|
Latin-10 |
|
|
|
|
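The remarks for the Unicode encoding forms in the table above refer to planes 0-16, that is, code points up to U+10FFFF. As a minimal sketch of what "given by algorithm" means for UTF-16, the following converts a code point to UTF-16 code units using surrogate pairs for planes 1-16. The function names `plane` and `to_utf16_units` are chosen for this example and do not come from any Unicode data file.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point of plane 16


def plane(cp):
    """Return the plane number (0-16) of a code point."""
    return cp >> 16


def to_utf16_units(cp):
    """Return the UTF-16 code units for one code point."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError("code point outside planes 0-16")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if cp < 0x10000:
        return [cp]  # plane 0: a single code unit, identical to UCS-2
    v = cp - 0x10000  # 20-bit value split across two surrogates
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


for cp in (0x0041, 0x1D11E, MAX_CODE_POINT):
    units = to_utf16_units(cp)
    print(f"U+{cp:04X} plane {plane(cp)}:", " ".join(f"{u:04X}" for u in units))
```

The 20 bits available in a surrogate pair are what bound UTF-16 to planes 0-16, which is why the table describes UTF-8 and UTF-32 as limited to the same range.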
Character encoding |
Mapping to Unicode/UTF-16 |
Date of last update |
Remark |
Mac OS Arabic |
1999-Sep-22 |
|
|
Mac OS Central European |
1999-Sep-22 |
|
|
Mac OS Chinese Simplified |
1999-Sep-22 |
|
|
Mac OS Chinese Traditional |
1999-Sep-22 |
|
|
Mac OS Croatian |
1999-Sep-22 |
|
|
Mac OS Cyrillic |
1999-Sep-22 |
|
|
Mac OS Devanagari |
1999-Sep-22 |
|
|
Mac OS Farsi |
1999-Sep-22 |
|
|
Mac OS Greek |
1999-Sep-22 |
|
|
Mac OS Gujarati |
1999-Sep-22 |
|
|
Mac OS Gurmukhi |
1999-Sep-22 |
|
|
Mac OS Hebrew |
1999-Sep-22 |
|
|
Mac OS Icelandic |
1999-Sep-22 |
|
|
Mac OS Japanese |
1999-Sep-22 |
|
|
Mac OS Korean |
1999-Sep-22 |
|
|
Mac OS Roman |
1999-Sep-22 |
|
|
Mac OS Romanian |
1999-Sep-22 |
|
|
Mac OS Thai |
1999-Sep-22 |
|
|
Mac OS Turkish |
1999-Sep-22 |
|
|
Mac OS Ukrainian |
1999-Sep-22 |
|
|
|
|
|
|
NEXTSTEP Encoding |
1999 September 23 |
|
|
|
|
|
|
CP 10007 MacCyrillic |
04/24/96 |
|
|
CP 10006 MacGreek |
04/24/96 |
|
|
CP 10079 MacIcelandic |
04/24/96 |
|
|
CP 10029 MacLatin2 |
04/24/96 |
|
|
CP 10000 MacRoman |
04/24/96 |
|
|
CP 10081 MacTurkish |
04/24/96 |
Character encoding |
Mapping to Unicode/UTF-16 |
Date of last update |
Remark |
CP 874 |
02/28/98 |
Latin/Thai |
|
CP 932 |
04/15/98 |
MS Shift-JIS |
|
CP 936 |
04/15/98 |
MS Chinese (Simpl.) |
|
CP 949 |
04/15/98 |
MS Korean |
|
CP 950 |
04/15/98 |
MS Big-5 (Trad. Chinese) |
|
CP 1250 |
04/15/98 |
|
|
CP 1251 |
04/15/98 |
Latin/Cyrillic |
|
CP 1252 |
04/15/98 |
Extends on ISO/IEC 8859-1 Latin-1 |
|
CP 1253 |
04/15/98 |
Latin/Greek |
|
CP 1254 |
04/15/98 |
|
|
CP 1255 |
04/15/98 |
Latin/Hebrew |
|
CP 1256 |
01/05/99 |
Latin/Arabic |
|
CP 1257 |
04/15/98 |
|
|
CP 1258 |
04/15/98 |
|
See also the IBM README file (vendors/ibm/readme.txt) on encoding mappings.
Character encoding |
Mapping to Unicode/UTF-16 |
Date of last update |
Remark |
CP 437 Latin (US) |
04/24/96 |
|
|
CP 737 Greek (A) |
04/24/96 |
|
|
CP 775 BaltRim |
04/24/96 |
|
|
CP 850 Latin (A) |
04/24/96 |
|
|
CP 852 Latin (B) |
04/24/96 |
|
|
CP 855 Cyrillic (A) |
04/24/96 |
|
|
CP 857 Turkish |
04/24/96 |
|
|
CP 860 Portuguese |
04/24/96 |
|
|
CP 861 Icelandic |
04/24/96 |
|
|
CP 862 Hebrew |
04/24/96 |
|
|
CP 863 Canada F |
04/24/96 |
|
|
CP 864 Arabic |
04/24/96 |
|
|
CP 865 Nordic |
04/24/96 |
|
|
CP 866 Cyrillic (B) |
04/24/96 |
|
|
CP 869 Greek (B) |
04/24/96 |
|
|
CP 874 Thai |
04/15/98 |
Non-ISO encodings on Unixes, Adobe's, non-MS PC, GSM/SMS, RDS, ...
Character encoding |
Mapping to Unicode/UTF-16 |
Date of last update |
Remark |
ETSI 7-bit default alphabet |
|
|
GSM/SMS (UCS-2 can also be used for GSM/SMS) |
|
|
|
|
Adobe Standard Encoding |
30 March 1999 |
|
|
|
|
|
|
IBM CP 1006 |
1999 July 27 |
ASCII+Arabic |
|
CP 856 |
1999 July 27 |
ASCII+Hebrew |
|
KOI 8-R (RFC 1489) |
18 August 1999 |
ASCII+Cyrillic |
|
|
|
|
|
JIS X 0201 (1976) |
8 March 1994 |
|
|
Shift-JIS |
8 March 1994 |
|
|
Johab |
08/16/99 |
|
See also: vendors/ibm/readme.txt.
Character encoding |
Mapping to Unicode/UTF-16 |
Date of last update |
Remark |
UTF-EBCDIC |
Given by algorithm (UTR 16) |
|
Only for use where EBCDIC is required. |
|
|
|
|
IBM EBCDIC CP 424 (Hebrew) |
1999 July 27 |
|
|
|
|
|
|
CP 037 IBM US Canada |
04/24/96 |
|
|
CP 500 IBM International |
04/24/96 |
|
|
CP 875 IBM Greek |
04/24/96 |
|
|
CP 1026 IBM Latin-5 Turkish |
04/24/96 |
|
East Asian without ASCII/EBCDIC, symbol, dingbat, corporate zone, character entities, cross-references, ...
(Character encoding) |
(Mapping to Unicode/UTF-16) |
Date of last update |
Remark |
IBM PC memory-mapped video graphics |
1999 July 27 |
|
|
|
|
|
|
SGML character entities |
25 July 1997 |
|
|
|
|
|
|
Adobe Symbol Encoding |
30 March 1999 |
|
|
Adobe Zapf Dingbats Encoding |
30 March 1999 |
|
|
|
|
|
|
Registry of Apple use of Unicode corporate-zone |
1999-Sep-22 |
Registry, not a mapping |
|
Mac OS Dingbats |
1999-Sep-22 |
|
|
Mac OS Symbol |
1999-Sep-22 |
|
|
|
|
|
|
TCVN-NSCII Stack 1.0 HyperCard stack |
|
|
|
Unicode Han Character Cross-Reference |
14 March 1994 |
|
|
Unihan database |
23 September 1996 |
|
|
|
|
|
|
Korean Hangul Encoding Conversion |
Oct 04, 1995 |
|
|
KS C 5601 |
6 December 1993 |
Note: For Unicode 1.1! Obsolete! |
|
Unified Hangeul (KS C 5601-1992) |
07/24/95 |
For Unicode 2.0 and onwards. |
|
Unified Hangul (KS X 1001) |
08/16/99 |
|
|
|
|
|
|
JIS X 0208 (1990) |
8 March 1994 |
|
|
JIS X 0212 (1990) |
8 March 1994 |
|
|
|
|
|
|
GB 12345-80 |
6 December 1993 |
|
|
GB 2312-80 |
6 December 1993 |
|
|
|
|
|
|
BIG5 |
11 February 1994 |
|
|
CNS 11643-1986 |
21 October 1994 |
|