[Hoping the shubnet doesn't get this one too . . .]
WTF-8 could potentially be as compact or more compact than UTF-8 (for
Greek, Arabic ...), since much of the Latin-1 and Latin Extended A blocks
aren't needed in WCode. If you moved the other characters down to
fill that space, you might win back what you lost to C1 compatibility.
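
For reference, here is a quick sketch of the baseline being compared
against: how many bytes standard UTF-8 spends per character on Latin,
Greek and Arabic text. It measures UTF-8 only. The WCode/WTF-8 layout
isn't specified in this message, so it is not modeled here; the sample
strings and the helper name utf8_bytes_per_char are made up for
illustration.

```python
# Baseline only: bytes per character in standard UTF-8.
# The sample strings and helper name are illustrative assumptions,
# not part of any WCode/WTF-8 proposal.

def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes used per code point in text."""
    return len(text.encode("utf-8")) / len(text)

samples = {
    "Latin (ASCII)": "alpha",
    "Greek": "\u03b1\u03bb\u03c6\u03b1",   # alpha lambda phi alpha
    "Arabic": "\u0623\u0644\u0641",        # alef-with-hamza, lam, feh
}

for name, text in samples.items():
    print(f"{name}: {utf8_bytes_per_char(text):.1f} bytes/char")
```

Greek and Arabic letters sit in the U+0080..U+07FF range, which UTF-8
encodes in two bytes, so that is the figure any repacked layout would
have to match or beat.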
I've considered writing up my own WCode (just for the heck of it) before.
My big fix would be losing ASCII compatibility(!), which would allow us
to remove redundant and ill-defined controls and characters (ASCII
apostrophe! CR-LF!). Move the basic set of controls (LS, PS, ZWJ, etc.)
and the basic set of script-neutral punctuation and characters
(.,:;?! and possibly the Indo-European (Arabic?) digits 0-9) into the
bottom 128, followed by the combining characters and then
the decomposed Latin and so on. Losing ASCII compatibility is
much more radical than what you've proposed, though.
-- David Starner - dstarner98@aasaa.ofe.org
Pointless (and temporarily down) webpage: http://dvdeug.dhis.org