10646 subsets (was: Re: Devanagari enthousiasm!)

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Mar 06 2002 - 16:58:00 EST


Michael Everson said:

> >No, a Unicode font does not need to contain Latin letters.
>
> A valid ISO/IEC 10646 subset must contain ASCII.

Others have already pointed out the obvious disconnect
between 10646 subsets and what can be in a valid
Unicode font (which contains glyphs, not characters).
Beyond that, this statement is not correct even in its
proper context. To cite chapter and verse:

10646 defines two kinds of subsets:

Limited subsets (clause 12.1) are simply enumerations of
any list of code points ("code positions" in 10646-speak).
There are no constraints on this kind of subset, so
it could consist merely of a list of Hebrew combining
marks, for example.

Selected subsets (clause 12.2) consist of lists of
collections from Annex A. It is *selected* subsets
which automatically contain U+0020..U+007E. And
note that it is only *those* code points which
are included, and not "ASCII" -- which would also
imply inclusion of U+0000..U+001F and U+007F.
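
To make the distinction concrete, here is a rough Python sketch of
the two kinds of subsets. The function names and the example ranges
are my own illustration and are not taken from 10646; in particular,
the Devanagari range is only a stand-in for a collection and should
not be read as the actual Annex A definition.

```python
# Illustrative sketch only. Names and example ranges are assumptions,
# not definitions copied from ISO/IEC 10646.

# The code points a selected subset contains automatically.
# Note the range stops at U+007E, so U+007F (DELETE) is excluded.
AUTOMATIC = range(0x0020, 0x007F)  # U+0020..U+007E


def limited_subset(code_points):
    """Limited subset: any explicit list of code positions, unconstrained."""
    return frozenset(code_points)


def selected_subset(*collections):
    """Selected subset: union of chosen collections plus U+0020..U+007E."""
    members = set(AUTOMATIC)
    for collection in collections:
        members.update(collection)
    return frozenset(members)


def contains_ascii(subset):
    """True only if every code point U+0000..U+007F is present."""
    return all(cp in subset for cp in range(0x0000, 0x0080))


# Hypothetical example data: a few Hebrew combining points, and a
# stand-in range for a Devanagari collection.
HEBREW_POINTS = range(0x05B0, 0x05BA)     # U+05B0..U+05B9
DEVANAGARI_RANGE = range(0x0900, 0x0980)  # U+0900..U+097F

hebrew_marks = limited_subset(HEBREW_POINTS)
devanagari = selected_subset(DEVANAGARI_RANGE)
```

With these definitions, `hebrew_marks` contains no Latin letters at
all, while `devanagari` picks up U+0041 automatically but still
fails `contains_ascii`, because the control codes U+0000..U+001F
and U+007F are not part of it.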

--Ken


