RE: XML Blueberry Requirements

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Thu Jun 21 2001 - 13:24:12 EDT


> The only reason there's a problem here at all is because IBM
> tried to go it alone as a monopoly and set standards by fiat for years
> rather than working with the rest of the industry. Consequently their
> mainframe character sets don't really interoperate well with everybody
> else's character sets. In XML this arises as a problem with line endings
> when someone edits an XML document with an IBM mainframe text editor.
>
The problem is not as simple as it seems. ASCII was not widely adopted in
the early 60's when IBM started work on the 360 line. Baudot and Hollerith
were much more widely used. IBM thought that a 7 bit system made no sense
and would soon die. They decided to go from BCD to EBCDIC, a fully 8 bit
system. They felt that in addition to cr and lf they needed a single
character to do both.

It was PCs that made ASCII-derived coding systems the current standard. But
it still is a problem when data is expected to be 7 bit clean. Other
systems such as Unix don't comply with the standard. Because they used dumb
terminals, it was too much to require a user to enter both a lf and cr.
Windows follows the ASCII standard but we accommodate Unix because it is a
major factor in today's market place.

However, I don't understand why IBM cannot support ls (U+2028) and ps
(U+2029) if Windows can. The only issue that I can see is that they both
support a lf without cr. I guess the difference is that Windows does not
use lf without cr. If this is the problem, it is a problem that can occur
with any ASCII system that uses lf as per the specs.
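
To make the point concrete, here is a rough sketch in Python of a reader that accepts all of these conventions on input. The function name is made up for illustration, and folding ps into a plain lf is a simplification that throws away the paragraph distinction.

```python
import re

# Line breaks a tolerant reader might accept: cr lf, bare cr, bare lf,
# NEL (U+0085), ls (U+2028) and ps (U+2029).
# "\r\n" comes first in the alternation so the pair collapses to ONE break.
_LINE_BREAKS = re.compile("\r\n|[\r\n\x85\u2028\u2029]")


def normalize_line_breaks(text: str) -> str:
    """Replace every recognized line break with a single lf (hypothetical helper)."""
    return _LINE_BREAKS.sub("\n", text)


if __name__ == "__main__":
    sample = "one\r\ntwo\u2028three\nfour\x85five"
    print(normalize_line_breaks(sample).split("\n"))
```

A system that only ever sees cr lf pairs never hits the bare lf case, but the reader above handles it the same way as the others.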

I have seen old applications that do use lf to position to a new line
without returning the carriage. These applications did so because it
printed faster. However most of these applications were phased out when
bi-directional printers were introduced. Some of these printers did not
position properly.

On the other hand, if the problem is that there are two mappings for EBCDIC
0x15 and 0x25, it is an IBM problem of their own making. In fact using
U+2028 should make IBM's job easier. They can convert it to one character
or the other depending on the specific EBCDIC encoding that they want.
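
A minimal sketch of that conversion, again in Python. It assumes the cp037 codec for the ordinary characters, and the helper name and constants are my own. The caller picks which EBCDIC byte stands for the line break, so the choice between 0x15 and 0x25 is made once, at the output side.

```python
# EBCDIC control bytes under discussion.
EBCDIC_NL = 0x15  # EBCDIC "new line"
EBCDIC_LF = 0x25  # EBCDIC "line feed"


def encode_ebcdic(text: str, line_break: int = EBCDIC_NL, codec: str = "cp037") -> bytes:
    """Encode text to EBCDIC, writing each U+2028 as the chosen byte.

    Hypothetical helper: only U+2028 is treated as a line break here;
    all other characters go through the codec unchanged.
    """
    chunks = text.split("\u2028")
    separator = bytes([line_break])
    return separator.join(chunk.encode(codec) for chunk in chunks)


if __name__ == "__main__":
    print(encode_ebcdic("a\u2028b", EBCDIC_NL).hex())
    print(encode_ebcdic("a\u2028b", EBCDIC_LF).hex())
```

Because U+2028 has only one meaning, nothing upstream of this function has to guess which of the two EBCDIC characters was intended.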

Carl


