From: Joó Ádám (ceriak@gmail.com)
Date: Fri Jan 09 2009 - 05:19:57 CST
> I've thought about this. But since you would want to intermix text
> and non-text, it makes sense to retain Unicode as a subset and use
> the same UTF encoding schemes. The problem, though, is that Unicode
> claims all the code points, so a new standard would have to violate
> the rules, either by using planes that Unicode will probably never
> use (*), or by going beyond plane 16 (which is impossible with UTF-16
> and specifically disallowed for UTF-8 and UTF-32 conformance).
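Just to make the arithmetic behind "impossible with UTF-16" concrete:
the surrogate mechanism only carries 2 × 10 payload bits, so the last
reachable code point is 0x10000 + 0xFFFFF = U+10FFFF, the end of plane
16. A minimal sketch (the helper name is made up for illustration, not
any real API):

#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: encode a supplementary
 * code point as a UTF-16 surrogate pair. Each surrogate holds 10
 * payload bits, so the pair tops out at 0x10000 + 0xFFFFF = 0x10FFFF,
 * the last code point of plane 16. */
static int to_surrogates(uint32_t cp, uint16_t out[2])
{
    if (cp < 0x10000 || cp > 0x10FFFF)
        return -1;                              /* no pair exists */
    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = 0xD800 | (uint16_t)(cp >> 10);     /* high surrogate */
    out[1] = 0xDC00 | (uint16_t)(cp & 0x3FF);   /* low surrogate  */
    return 0;
}

int main(void)
{
    uint16_t pair[2];
    to_surrogates(0x10FFFF, pair);              /* last code point */
    printf("U+10FFFF -> %04X %04X\n", pair[0], pair[1]);
    return 0;
}

Anything past plane 16 simply has no surrogate representation, so a
hypothetical superset standard could not reuse UTF-16 unchanged.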
So you are back to the original problem, having realized that
Unicode cannot save the world: you simply cannot use one single
encoding to represent every kind of data, since different data
requires different binary representations depending on its
characteristics, at least if our goal is efficiency.
Maybe at some point Unicode simply drew the longbow and wanted to be
more than it was supposed to be?
The real problem with legacy character encodings was, on the one
hand, bad design, and on the other, the poor use (or outright lack,
or even non-existence) of appropriate markup.
Having different encodings for different media is natural, even for
different subsets of a single medium's elements – you only have to
design them well.
Regards,
Ádám