From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Fri Jul 18 2003 - 07:07:36 EDT
Philippe Verdy wrote:
> MES-2 is a collection of characters independent of their actual
> encoding.
> To support MES-2 in a Unicode-compliant application, extra characters
> need to be added, notably if the minimum requirement for information
> interchange is the NFC form used by XML and HTML related standards.
The Unicode normal forms (for a particular version of Unicode) are
defined for ALL of the characters in that version. There is no concept
of a Unicode normal form for a subset of the characters in a particular
version.
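
To make the point concrete, here is a minimal sketch (Python, using the
standard unicodedata module) of how one could test whether a repertoire
is closed under NFC. The function names and the toy repertoires are
made up for illustration and are NOT the real MES-2 character list. It
only checks the inputs it is given, so it is not a complete closure test.

```python
import unicodedata

def nfc_leaks(text, subset):
    """Return the characters of NFC(text) that are not in `subset`."""
    normalized = unicodedata.normalize("NFC", text)
    return sorted(set(normalized) - set(subset))

def single_char_leaks(subset):
    """Map each member of `subset` whose own NFC form leaves `subset`
    to the characters that leaked out."""
    result = {}
    for ch in sorted(subset):
        leaks = nfc_leaks(ch, subset)
        if leaks:
            result[ch] = leaks
    return result

# Hypothetical toy repertoires, NOT the real MES-2 character list.
# U+212B ANGSTROM SIGN has a singleton canonical decomposition to U+00C5.
with_singleton = {"A", "\u212B"}           # U+212B present, U+00C5 absent
closed_toy = {"A", "\u00C5", "\u212B"}     # U+00C5 added

print(single_char_leaks(with_singleton))   # U+212B normalizes out of the set
print(single_char_leaks(closed_toy))       # nothing leaks
# Sequences matter too: A + U+030A COMBINING RING ABOVE composes to U+00C5.
print(nfc_leaks("A\u030A", {"A", "\u030A"}))
```

The normalization itself is always the full Unicode one; the subset
only decides which results count as falling outside the repertoire.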
However, the MESes (there are four of them!) are useful for specifying
"minimum European" font coverage, and "input method" support (the
latter need not be via keyboard).
This is not to say that the MESes are unproblematic. To mention just
two points not already mentioned: none of the "new" math characters
are included even in MES-3 (a, b), even though "all" math characters
were supposed to be included, and not even MES-3 covers all official
minority languages.
> It would be interesting to inform CEN about how MES-2 can be
> documented to comply with all normative Unicode algorithms, and
> the minimum is to ensure the NFC closure of this subset, which
> should have better not included compatibility characters canonically
> decomposed to singleton decompositions, and should now reintegrate
> the missing NFC form.
I think it is [extremely] unlikely at this point that anyone will
change the MESes or add new ones. Note that implementors are in no way
prohibited from supporting (in fonts, plus rendering software, and some
form of input) more than the MESes state. (But as Philippe states, some
rather useless characters have been included for compatibility
reasons.)
/kent k