From: Dominikus Scherkl (lyratelle@gmx.de)
Date: Thu May 19 2005 - 02:33:30 CDT
> For example: by using a
> modified UTF-8 format where a ASCII letter can be used as a
> switch selector between any local encodings - that method
> will allow to save A LOT of space for commonly used characters.
That already exists - use SCSU (the Standard Compression Scheme
for Unicode).
Its only disadvantage is ambiguity: the same sequence of
characters can be encoded as different byte sequences, so
round-tripping will most likely destroy digital signatures and
the like.
You should look carefully at the standard.
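To make the space argument concrete, here is a minimal sketch,
assuming ICU4J is on the classpath (the JDK itself does not ship
an SCSU charset); the sample text, class name and exact byte
counts are only illustrative:

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import com.ibm.icu.charset.CharsetProviderICU;

public class ScsuDemo {
    public static void main(String[] args) {
        String text = "Москва"; // 6 Cyrillic letters

        // ICU4J exposes SCSU as an ordinary java.nio charset.
        Charset scsu = new CharsetProviderICU().charsetForName("SCSU");

        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer compact = scsu.encode(text);

        // UTF-8 spends 2 bytes per Cyrillic letter; SCSU can select the
        // predefined Cyrillic window once and then spend 1 byte per letter.
        System.out.println("UTF-8: " + utf8.length + " bytes");
        System.out.println("SCSU : " + compact.remaining() + " bytes");

        // The characters round-trip fine...
        String back = scsu.decode(compact.duplicate()).toString();
        System.out.println(text.equals(back)); // true
        // ...but another conformant SCSU encoder may choose a different,
        // equally valid byte sequence for the same text, so a signature
        // computed over SCSU bytes need not survive decode + re-encode.
    }
}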
The only problems I can see with the "can't take it back"
policy are:
- the characters used for a particular language become
discontinuous. But: this is inevitable, at least for characters
shared by different languages that require different orderings
or use more or fewer characters from the set. The problem is
solved by collation algorithms (see the sketch after this list).
Any assignment of code points will need such an algorithm for
at least some languages, so a new standard would make nothing
easier.
- false or unnecessary characters can't be excluded. But:
no one requires you to use them, and they can be deprecated,
so they won't be used at all. That leaves only a space problem,
but for now there is plenty of room left to encode new
characters. Maybe, if the code space really does become too
full, the Unicode Consortium will change its policy and declare
the (by then unused for many decades) deprecated characters
unassigned.
- Characters are encoded that you find obsolete or unnecessary.
But that's your problem: simply don't use them.
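To illustrate the collation point from the first item: a minimal
sketch using the JDK's own Collator (results depend on the JDK's
locale data; the class name is only illustrative). The same two
strings sort differently in German and Swedish, so no single
assignment of code points could ever be "alphabetical" for every
language - tailored collation is needed anyway:

import java.text.Collator;
import java.util.Locale;

public class CollationDemo {
    public static void main(String[] args) {
        String a = "ä", z = "z";

        Collator german  = Collator.getInstance(Locale.GERMAN);
        Collator swedish = Collator.getInstance(Locale.forLanguageTag("sv-SE"));

        // Raw code point / code unit order: U+00E4 > U+007A.
        System.out.println("code units: " + Integer.signum(a.compareTo(z)));        //  1
        // German sorts ä together with a, i.e. before z.
        System.out.println("German    : " + Integer.signum(german.compare(a, z)));  // -1
        // Swedish sorts ä at the end of the alphabet, after z.
        System.out.println("Swedish   : " + Integer.signum(swedish.compare(a, z))); //  1
    }
}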
So, I don't see any need for a new standard.
-- Dominikus Scherkl