From: Doug Ewell (doug@ewellic.org)
Date: Sat Apr 11 2009 - 14:26:58 CDT
Hans Aberg <haberg at math dot su dot se> wrote:
>> The set of ASCII characters is a proper and intact subset of the set
>> of Unicode characters.
>
> Is this really true?
>
> I thought ASCII defined its characters as bytes, whereas Unicode uses
> code-points which when mapped using UTF-8 will contain the ASCII as a
> subset.
The *set of characters* in ASCII is a proper and intact subset of
Unicode. How these characters are represented inside computer storage
and transmission protocols may be defined differently, and doesn't
affect my argument that "ASCII characters" and "Unicode characters" are
not disjoint sets.
Actually, I was under the impression that ASCII was defined in terms of
7-bit code units, whereas there are virtually no computers or users
today who think in terms of 7-bit code units.
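
As a rough illustration of the two points above, here is a minimal Python sketch. It assumes Python 3.7 or later (for str.isascii), and the variable names are only illustrative. It checks that each of the 128 ASCII values maps to the Unicode code point with the same number, that each value fits in 7 bits, and that UTF-8 encodes each one as the identical single byte. The last check uses "é" as one arbitrary example of a Unicode character outside ASCII.

```python
# All 128 ASCII code values, decoded as ASCII into Python (Unicode) strings.
ascii_chars = [bytes([i]).decode("ascii") for i in range(128)]

# Each ASCII character has the same number as its Unicode code point.
same_code_points = all(ord(ch) == i for i, ch in enumerate(ascii_chars))

# Every one of those values fits in 7 bits.
fits_in_7_bits = all(ord(ch).bit_length() <= 7 for ch in ascii_chars)

# UTF-8 encodes each of them as the identical single byte.
utf8_matches = all(
    ch.encode("utf-8") == bytes([i]) for i, ch in enumerate(ascii_chars)
)

# The subset is proper: Unicode also has characters outside ASCII.
is_proper_subset = not "é".isascii()

print(same_code_points, fits_in_7_bits, utf8_matches, is_proper_subset)
```

The first check concerns the character set itself; the UTF-8 check concerns only one particular storage representation, which is a separate matter.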
--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages