Re: 8859-1, 8859-15, 1252 and Euro

From: Tony Harminc (tzha1@ibm.net)
Date: Thu Feb 10 2000 - 19:38:35 EST


On 10 Feb 00, at 8:09, Erik van der Poel wrote:

> There is a boundary between mainframes and the Internet. There is a
> gateway at that boundary. The gateway should take care of the octets
> in the C1 range, so that the big mainframe doesn't choke on the data
> produced by the little PC. The gateway will need to do this for UTF-8
> *anyway*, so it might as well do it for windows-1252 too.

Erik, you seem to have what I can only call an absurd view of how
mainframes work. Perhaps some vast water cooled boxes in a glass-
walled room full of spinning magnetic tapes, tended to by men in
white lab coats, and all running on COBOL and punched cards?

Come on - it's 2000, and things have changed just a little bit.

There is no "gateway" between mainframes and the Internet. Of course
most mainframes, like most other computers, are protected by a
firewall of some sort, but the mainframe has a TCP/IP stack like any
other machine, and exchanges packets with the rest of the world.

Mainframes do not "choke" on byte values outside the valid character
range of a particular codepage. Any byte value is treated like any
other - indeed old-time mainframe programmers find the idea that a
zero byte in a data stream should have special significance very
strange.
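
A minimal Python sketch of that point, assuming a hypothetical
length-prefixed record format (the names frame and unframe and the
2-byte length field are illustrative only, not any real mainframe
format). When the length travels with the data, a zero byte is just
another payload byte:

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix the payload with a 2-byte big-endian length (hypothetical format)."""
    return struct.pack(">H", len(payload)) + payload


def unframe(record: bytes) -> bytes:
    """Read the length prefix and return exactly that many payload bytes."""
    (length,) = struct.unpack(">H", record[:2])
    return record[2:2 + length]


# Payload containing a zero byte, two C1-range bytes and 0xFF.
payload = bytes([0x41, 0x00, 0x42, 0x80, 0x9F, 0xFF])
record = frame(payload)

# Every byte value survives, because nothing in the framing inspects the data.
round_tripped = unframe(record)

# For contrast: a reader that treats 0x00 as a terminator sees only the
# bytes before the first zero.
nul_terminated_view = payload.split(b"\x00", 1)[0]
```

The contrast with the NUL-terminated view is only there to show where
the "special" zero byte comes from: the reader's convention, not the
machine.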

The only reason for all this trouble over misuse of the C1 controls
is that the architecture of a particular IBM display terminal, the
3270, required that the range 00-3F and FF be control characters.
The 3270 is probably the most popular terminal in history
(conceivably second to one of the DEC VT models, but I doubt it), and
the legacy of its architecture has stayed with us even though the
terminals themselves are mostly museum pieces. But there is nothing,
I repeat *nothing* in any other part of mainframe architecture that
prevents bytes or characters of any value from being handled, stored
in files, sent to screens, displayed on X-Windows terminals, etc. etc.
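
To make the connection between the C1 range and the 3270 range
concrete, here is a rough Python sketch. It assumes Python's cp037
codec as a stand-in EBCDIC code page (an assumption of this example;
other EBCDIC pages may place some characters differently), and the
helper names are made up for illustration. It shows a windows-1252
Euro byte, treated as ISO 8859-1, landing inside 00-3F after
conversion:

```python
# The 3270 control range described above: 00-3F plus FF.
CONTROL_3270 = set(range(0x00, 0x40)) | {0xFF}


def is_3270_control(byte: int) -> bool:
    """True if the byte value falls in the 3270 control range."""
    return byte in CONTROL_3270


def latin1_to_ebcdic(data: bytes) -> bytes:
    """Convert ISO 8859-1 bytes to EBCDIC, using cp037 as a stand-in."""
    return data.decode("latin-1").encode("cp037")


# In windows-1252, byte 0x80 is the Euro sign.
euro_1252 = b"\x80"
euro_char = euro_1252.decode("cp1252")

# Read as ISO 8859-1, the same byte is the C1 control U+0080, and the
# conversion places it in the 3270 control range.
euro_as_ebcdic = latin1_to_ebcdic(euro_1252)
euro_hits_control_range = is_3270_control(euro_as_ebcdic[0])

# The same holds for the whole C1 range 80-9F under this mapping.
c1_in_ebcdic = latin1_to_ebcdic(bytes(range(0x80, 0xA0)))
all_c1_hit_control_range = all(is_3270_control(b) for b in c1_in_ebcdic)

# Ordinary letters convert to graphic positions outside that range.
letter_a_in_ebcdic = latin1_to_ebcdic(b"A")
```

Nothing here "chokes": the conversion is well defined for every byte.
The only consequence is that the converted bytes sit in positions a
3270 would treat as controls rather than as displayable characters.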

> On Unix, these C1 octets can be mapped to appropriate glyph codes, if
> the user has installed some of the more modern X fonts, such as
> *-iso10646-1. The Unix apps are still somewhat behind perhaps, but
> they will eventually catch up or die. We are planning to make some
> changes to Unix Mozilla 5.0 to deal with these windows-1252 characters
> (and UTF-8 too of course).

Here again I sense a misunderstanding. UNIX does not imply ASCII.
The standard codepage for UNIX on OS/390 is 1047.
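
As a small illustration, the sketch below encodes the same text in
ASCII and in an EBCDIC code page. It uses Python's cp037 codec rather
than 1047, purely as an assumption of this example; the two pages
differ in some punctuation positions, so treat the output as
indicative only.

```python
text = "UNIX"

# The same four characters as bytes under two different encodings.
ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")  # stand-in for an EBCDIC page such as 1047

# A program that hard-codes ASCII values gets the wrong answer on EBCDIC data.
looks_like_ascii_u = ebcdic_bytes[0] == 0x55

# A program that decodes with the right code page gets the text back.
decoded = ebcdic_bytes.decode("cp037")
```

The text is identical in both cases; only the byte values differ, which
is why code that compares against literal ASCII values is not portable
to such a system.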

Tony H.


