Regarding DBCS EBCDIC tables, are there any conversion tables to/from the
Windows CP's or do I have to triangulate the conversion through Unicode?
Thanks
Steve.
-----Original Message-----
From: <unicode@unicode.org >
Sent: 02 November 1998 06:43
To: Unicode List <unicode@unicode.org>
Cc: unicode@unicode.org
Subject: Re: EBCDIC Encoding question
Hello,
On 1998-10-26 at 13:19, Julia Oesterle (Unicode) wrote:
> Can any EBCDIC people answer this fellow's question?
Though I am not one of those "EBCDIC people", I can (as the local guru on
character encodings, and former EBCDIC user).
On 1998-10-22 at 12:31, Daniel Oppenheimer wrote:
> I am especially interested in converting between ASCII and EBCDIC.
Note that ASCII uses 7 bits per character, whilst EBCDIC uses 8 bits.
Hence, the mapping cannot be bijective.
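
The counting argument above can be checked with a short sketch. It assumes Python's built-in "cp037" codec as a stand-in for one EBCDIC variant; the function name is made up for illustration.

```python
# ASCII has 2**7 code points, an 8-bit EBCDIC code page has 2**8.
ASCII_CODE_POINTS = 2 ** 7
EBCDIC_CODE_POINTS = 2 ** 8

def ebcdic_bytes_outside_ascii(codec="cp037"):
    """Return the byte values of `codec` whose character is not in ASCII.

    `codec` is assumed to be a single-byte EBCDIC codec known to Python.
    """
    leftovers = []
    for value in range(EBCDIC_CODE_POINTS):
        char = bytes([value]).decode(codec)   # EBCDIC byte -> Unicode
        if ord(char) >= ASCII_CODE_POINTS:    # no 7-bit ASCII counterpart
            leftovers.append(value)
    return leftovers

leftovers = ebcdic_bytes_outside_ascii()
print(len(leftovers), "of", EBCDIC_CODE_POINTS,
      "EBCDIC byte values have no ASCII counterpart")
```

Any converter therefore has to decide what to do with those leftover byte values (reject, substitute, or map to some 8-bit superset of ASCII).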
Note also, that ASCII is a particular 7-bit code, viz. ISO 646 IRV,
whilst many vendors, and text-book authors, abuse the term "ASCII"
(or the similar term "ANSI") for a plethora of different encodings:
- MS-DOS abuses the term "ASCII" as a synonym for "text, in whatever
8-bit code currently is selected via the 'mode' command", usually
one of the IBM proprietary codes, CP 437 and CP 850;
- MS-Windows abuses the term "ANSI" for its proprietary 8-bit code,
CP 1252 (and perhaps also for other MS proprietary codes, depending
on the current language setting);
- many internet encoding utilities abuse the term "ASCII" for the
8-bit code "Latin-1" (ISO 8859-1), or its predecessor, the DEC multi-
lingual terminal code.
> However, there appears to be more than one kind of EBCDIC.
Actually, there are 11 (or so) different EBCDICs for the Latin-1 character
set, currently supported (the so-called CECPs = "Country Extended Code Pages",
if I am not mistaken), several other EBCDIC variants for other character
sets, and several hundred legacy EBCDIC variants.
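
To see how two Latin-1 EBCDIC variants differ, one can compare them byte by byte. This is only a sketch; it assumes Python's built-in "cp037" and "cp500" codecs match the tables in question, and the function name is made up for illustration.

```python
def differing_bytes(codec_a, codec_b):
    """Map each byte value that decodes differently to its two characters."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        a = raw.decode(codec_a)
        b = raw.decode(codec_b)
        if a != b:
            diffs[value] = (a, b)
    return diffs

diffs = differing_bytes("cp037", "cp500")
for value, (a, b) in sorted(diffs.items()):
    print(f"0x{value:02X}: cp037={a!r} cp500={b!r}")
```

The letters and digits sit at the same positions in both; the differences are confined to a handful of punctuation and symbol positions, which is exactly where a naive "EBCDIC" converter tends to go wrong.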
> I am working on an encoding converter.
Before embarking on any serious work concerning EBCDIC, you should obtain
your copy of the latest "CDRA Level 1 Reference" (SC09-1390) and "CDRA
Level 1 Registry" (SC09-1391) from your nearest IBM representative.
> Could someone tell me the difference between EBCDIC 500 and open EBCDIC?
What do you mean by "open EBCDIC"?
You may find the following tables useful:
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP037.TXT>
English (US) CECP, also used in Canada, Netherlands, Portugal, Brazil,
Australia, and New Zealand
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP500.TXT>
Belgium, Switzerland, and International CECP
(this was meant to become "the" international CECP, but this attempt
has failed; meanwhile, CECP 1047 is the agreed standard)
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP875.TXT>
Greek EBCDIC
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP1026.TXT>
Turkish EBCDIC (Latin-5 set)
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT>
Windows code for Latin-1 countries (the "ANSI" misnomer)
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP437.TXT>
The "classic" IBM PC code -- but see below
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP850.TXT>
The "international" IBM PC codepage, containing (but not limited to)
the Latin-1 character set -- but see below
All mappings in <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/>
are subject to the correction outlined in
<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/IBM/README.TXT>.
All of these tables map various code pages to Unicode, resulting in a common
descriptive framework for those distinct code pages.
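
Because every table maps to Unicode, a conversion between any two of these code pages can be triangulated through it. A minimal sketch, assuming Python's built-in codecs ("cp500", "cp1252", "cp037", "cp437") agree with the tables listed above; the function name transcode is made up for illustration.

```python
def transcode(data, source, target, errors="replace"):
    """Convert bytes from one code page to another via Unicode.

    Characters that do not exist in the target code page are handled
    according to `errors` ("replace" substitutes a question mark).
    """
    text = data.decode(source)            # source code page -> Unicode
    return text.encode(target, errors)    # Unicode -> target code page

# "Gruesse" with u-umlaut and sharp s, written with escapes.
ebcdic = "Gr\u00fc\u00dfe".encode("cp500")
windows = transcode(ebcdic, "cp500", "cp1252")
print(ebcdic.hex(), "->", windows.hex())
```

The errors argument matters in practice: the EBCDIC control characters, for instance, need not have a counterpart in a Windows or PC code page, so a real converter must choose a substitution policy.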
Best wishes,
Otto Stolz