From: Dean Snyder (dean.snyder@jhu.edu)
Date: Thu May 19 2005 - 01:08:46 CDT
Doug Ewell wrote at 10:15 PM on Tuesday, May 17, 2005:
>Now, in keeping with this, what problems does Unicode present that will
>lead to its replacement by something better?
Here, off the top of my head, are some problems with Unicode which,
cumulatively, could prove its undoing:
Needless complexity
Stateful mechanisms
No support for a clean division between text and meta-text
Errors in actual content
Legacy sludge
Irreversibility
>How will the "something better" solve these problems without
>introducing new ones?
Subsequent encoding efforts will be better because they will have
learned from the mistakes of earlier encoders ;-)
Probably the single most important, and extremely simple, step toward a
better encoding would be to make every encoded character exactly 4 bytes
wide.
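
To make the fixed-width point concrete, here is a minimal sketch that uses the existing UTF-32 encoding form as a stand-in for a 4-bytes-per-character design. The sample string and the helper name nth_char_utf32 are illustrative assumptions, not part of any proposal above.

```python
# Sample text: U+0061, U+00E9, U+20AC, U+1D49C (chosen only for illustration).
text = "a\u00e9\u20ac\U0001d49c"

# Byte width of each code point in three existing encoding forms.
for ch in text:
    widths = {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}
    print(f"U+{ord(ch):04X}", widths)


def nth_char_utf32(data: bytes, n: int) -> str:
    """Return the n-th code point of UTF-32BE data using plain offset arithmetic.

    Hypothetical helper: with a fixed 4-byte width, no scanning is needed.
    """
    return data[4 * n : 4 * n + 4].decode("utf-32-be")


encoded = text.encode("utf-32-be")
print(len(encoded), repr(nth_char_utf32(encoded, 2)))
```

With a fixed width, the byte offset of the n-th code point is simply 4 * n, whereas in UTF-8 or UTF-16 the data has to be scanned from the start.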
>How will it meet the challenge of transcoding untold amounts
>of "legacy" Unicode data?
Transcoding Unicode data into some new standard could at least be done
in ways similar to the ways pre-Unicode data is being transcoded into
Unicode now - an almost trivial pursuit.
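
As a rough sketch of what such table-driven transcoding looks like: the first function uses Python's built-in Latin-1 codec to turn legacy bytes into Unicode, and the second applies the same idea one step further. HYPOTHETICAL_REMAP and to_successor are invented for illustration; no successor standard or mapping table exists, and the single entry is only an example of the kind of remapping one might choose.

```python
def to_unicode(legacy: bytes, codec: str) -> str:
    """Transcode legacy bytes to Unicode using one of Python's codec tables."""
    return legacy.decode(codec)


# Hypothetical table: Unicode code point -> code point in an imagined successor.
# Example entry: fold U+212B ANGSTROM SIGN into U+00C5, its canonical equivalent.
HYPOTHETICAL_REMAP = {0x212B: 0x00C5}


def to_successor(text: str) -> list:
    """Map each Unicode code point through the table; unmapped ones pass through."""
    return [HYPOTHETICAL_REMAP.get(ord(ch), ord(ch)) for ch in text]


unicode_text = to_unicode(b"caf\xe9", "latin-1")
print(unicode_text, [hex(cp) for cp in to_successor(unicode_text)])
```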
>How will it respond to the inevitable objections from supporters
>of other encoding systems as Unicode has done?
Hopefully:
With no arrogance.
With broader cooperation.
With greater deliberation and less haste.
With more accumulated intelligence.
With better architectural design.
Don't get me wrong. I think ISO 10646/Unicode is, for the most part, a
wonderful pioneering effort to digitize the world's scripts. And there
is no doubt that all future encoders will make mistakes too. But I do
believe that hubris, intolerable in such matters, has unfortunately led
to short-sighted mistakes in both the architecture and content of
Unicode, mistakes Unicode is saddled with in perpetuity.
As just one example of the kind of architectural change that could drive
new encoding schemes, one could propose an encoding design that
references its own mutability, thereby redefining "stability" to include
not only extensibility but also reversibility. This would be
accomplished by dedicating some of the 32 bits in every 4-byte
character, e.g., 7 of them, as version indicators.
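
One way the version-bits idea could be laid out, purely as a sketch: the text above says only that 7 of the 32 bits would serve as version indicators, so putting them in the high bits, the big-endian byte order, and the names pack, unpack, and encode are all assumptions. The remaining 25 bits are more than enough for code points up to U+10FFFF, which need 21.

```python
import struct

VERSION_BITS = 7                   # from the text: 7 of the 32 bits
CODE_BITS = 32 - VERSION_BITS      # 25 bits left for the character code
CODE_MASK = (1 << CODE_BITS) - 1


def pack(version: int, code: int) -> int:
    """Combine a version and a character code into one 32-bit unit.

    Assumed layout: version in the top 7 bits, code in the low 25 bits.
    """
    if not 0 <= version < (1 << VERSION_BITS):
        raise ValueError("version does not fit in 7 bits")
    if not 0 <= code <= CODE_MASK:
        raise ValueError("code does not fit in 25 bits")
    return (version << CODE_BITS) | code


def unpack(unit: int) -> tuple:
    """Split a 32-bit unit back into (version, code)."""
    return unit >> CODE_BITS, unit & CODE_MASK


def encode(version: int, codes) -> bytes:
    """Serialize codes as big-endian 4-byte units, all tagged with one version."""
    return b"".join(struct.pack(">I", pack(version, c)) for c in codes)


unit = pack(3, 0x41)
print(hex(unit), unpack(unit), encode(3, [0x41, 0x1D49C]).hex())
```

Because every character carries its own version, a reader could in principle interpret each unit under the rules of the version that produced it, which is what would make later corrections reversible.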
Dean A. Snyder
Assistant Research Scholar
Manager, Digital Hammurabi Project
Computer Science Department
Whiting School of Engineering
218C New Engineering Building
3400 North Charles Street
Johns Hopkins University
Baltimore, Maryland, USA 21218
office: 410 516-6850
cell: 717 817-4897
www.jhu.edu/digitalhammurabi/
http://users.adelphia.net/~deansnyder/