Hacking Unicode on DOS (was: RE: Latin w/ diacritics (was Re: benefits of unicode))

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Apr 18 2001 - 14:48:55 EDT


Marco wrote:

> James Kass wrote:
> > > [...] but would it really take *millions* of dollars for
> > > implementing Unicode on DOS or Windows 3.1?
> >
> > It could be done with, say, Roman Czyborra's Unifont and QBasic.
>
> Why not? Or, even better, with a Unifont-derived BDF font and GNU C++.

Reason #1

Those are end-of-lifed, unsupported OSes. Sure, there are plenty
of machines still running applications on them, but the handwriting
is on the wall for those machines -- even in impoverished countries,
schools, organizations, whatever -- simply because there is no
way forward if they want to change or update whatever it is that
they are doing on those machines.

Reason #2

The *tools* that run on those OSes are end-of-lifed and obsolete.
That makes them difficult to get hold of, and it means there is no
support when you hit a problem. Sure, you can work in QBasic, or in
whatever other tool you might happen to have, but QBasic is a sorry
excuse for a development tool, and you have no guarantee that you
can share the burden with the next guy over, who may have a
different obsolete version of the OS and a different obsolete,
unsupported tool.

Reason #3

OSes like DOS and Windows 3.1 don't scale well to the demands
of Unicode application support -- over and above the fact that
there is not and never will be any OS-level support. The 16-bit
segmented memory models, the incompatible memory driver extensions
for DOS in particular (anybody else recall the config.sys hell
of juggling expanded and extended memory drivers to get access
past the 640K of base memory in DOS?), the file size limitations,
the limitations on supported disk sizes, and so on, all constrain
what you can do. Trying to program with tables in DOS, in
particular, runs up against all kinds of limits very quickly.
What seems like a *small* table in a 32-bit world quickly turns
into a *large* table in a 16-bit world, and then becomes a
programming problem in its own right. It isn't impossible, of
course -- in point of fact, the Unicode library I write for
Sybase was still being ported to Windows 3.1 up until about a
year ago, but even while it was running there, I don't know of
anyone who actually tried to use it to support Unicode on
Windows 3.1. And that was just a backend library -- the frontend
support is where the real problems would be.
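
To make the table-size arithmetic concrete, here is a small
illustration in C -- a hypothetical sketch, not code from the
Sybase library. A flat table holding one 16-bit property value
per BMP code point already overflows a single 64K segment, which
in the 16-bit memory models means resorting to compiler-specific
far/huge pointers or manually banking the table just to hold it:

    #include <stdio.h>

    /* One 16-bit property value per BMP code point.  Trivially
     * "small" on a 32-bit system, but 65536 * 2 = 131072 bytes:
     * double the 64K segment limit of the 16-bit DOS memory
     * models, so a plain C array of this size cannot even be
     * declared there without far/huge pointers or manual
     * bank-switching.
     */
    #define BMP_CODE_POINTS 65536UL

    int main(void)
    {
        unsigned long table_bytes =
            BMP_CODE_POINTS * (unsigned long)sizeof(unsigned short);
        printf("flat BMP property table: %lu bytes\n", table_bytes);
        printf("16-bit segment limit:    %lu bytes\n", 65536UL);
        return 0;
    }

And that is just one table; a real implementation needs several
of them -- properties, casing, mapping data -- so the 16-bit
problem compounds quickly.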

Reason #4

Lack of marketing and distribution support. Anything you
develop on DOS or Windows 3.1 is going to be a total
anachronism, developed by a true believer and a small group
of acolytes, but with no obvious channel for promotion and
distribution. There can always be a small group that knows about
and promotes it for particular niche usage, but that, too,
doesn't scale well. If you are trying to create a "Unicode
solution" for people who are stuck with backlevel computer
equipment, how do you get your solution to them? Many a
startup company that *was* well-funded with millions of dollars
has foundered on this rock simply because, even after it
had a clever solution in hand, it couldn't find its
customers and its customers couldn't find it.

Reason #5

Creating anachronistic Unicode applications to run on DOS
or Windows 3.1 isn't really doing the intended beneficiaries
much of a favor. If I am still using a 6 MHz 80286 AT-class
machine running MS-DOS 5.0, I can't run a browser and access
the internet. That is a far more crippling limitation on
my IT enablement today than the fact that I don't have any
Unicode applications running on the machine. Nobody is going
to reengineer IE 5.5 (or even IE 4.0 or IE 3.0) in QBasic
to run on DOS 5.0.

Personally, I still own a working vintage 1983 Osborne Executive
CP/M computer beefed up to 512K memory. Turbo Pascal 2.0 still
runs on that machine, and I know enough about WordStar
patch hacking, RAM character generator programming, and
printer driver hacking that I could probably upgrade that
system to a limited form of Unicode support, given enough
time. But in the end, what would I have accomplished except
to show I was a macho programmer of anachronistic equipment?
Who would even be able to read the 180K 5-1/4" floppy disks
to use my Unicode data?

Aside on hacking --

James Kass objected to the negative connotations of the term
that Peter was using. But there are at least two distinct uses.
One is hacking *into* systems, i.e., the kind of illegal security
breaching associated with "hackers" in the media.

The other, widespread practice is just hacking software, akin
to the concept of the "hack up". This generally just means to
take software or a specification that somebody else wrote and
to tweak and twist it to do things it was never designed to
do in the first place. As a limited, quick fix for some problem,
this is often the most cost-effective thing to do, but as a
long-term approach, "hacking" is almost always bad, since it
introduces maintenance problems, is almost never extensible,
is often bug-ridden if patched further, and is less effective than
a redesign that addresses the new problem head-on.

Peter is talking about the second kind of hacking, which has
both good connotations (as clever, quick, cost-effective
fixes to immediate problems) and bad connotations (as unmaintainable,
expensive hairballs when the hacking programmer moves on).

Hacking up fonts or applications to provide Unicode support
on backlevel OSes may indeed be a good idea to meet some
immediate pressing need, but people who are doing this ought
at the same time to be aware that what they are doing has
a built-in, inescapable obsolescence factor (think androids
in Blade Runner), and it would only be prudent to also be planning
for technology jumps to whatever is in the mainstream of
software (and hardware) development when possible.

--Ken