Re: ASCII as a subset of Unicode (was: Re: Oxford proposes a leaner alphabet)

From: Hans Aberg (haberg@math.su.se)
Date: Sun Apr 12 2009 - 02:32:24 CDT

    On 12 Apr 2009, at 03:05, Mark Davis wrote:

    > I agree. One needs to distinguish the ASCII characters from the
    > ASCII encoding scheme.
    >
    > The ASCII characters are represented in Unicode at codepoints
    > U+0000..U+007F. The ASCII encoding scheme represents these as bytes
    > %00..%7F, as does the UTF-8 encoding scheme. Other encoding schemes,
    > like EBCDIC CCSID 500, may use different byte sequences for the
    > ASCII characters.
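
    The distinction drawn in the quoted paragraph can be checked with a
    minimal Python sketch. It assumes that Python's "cp500" codec
    corresponds to EBCDIC CCSID 500; the sample string is arbitrary and
    chosen only for illustration.

```python
# Sketch: the same ASCII characters under three encoding schemes.
# Assumption: Python's "cp500" codec stands in for EBCDIC CCSID 500.
text = "Az09"  # arbitrary sample made of ASCII characters

# Unicode code points of the characters (all within U+0000..U+007F).
code_points = [ord(ch) for ch in text]

ascii_bytes = text.encode("ascii")    # ASCII encoding scheme
utf8_bytes = text.encode("utf-8")     # UTF-8 encoding scheme
ebcdic_bytes = text.encode("cp500")   # EBCDIC encoding scheme

print([f"U+{cp:04X}" for cp in code_points])
print("ascii :", ascii_bytes.hex())
print("utf-8 :", utf8_bytes.hex())
print("cp500 :", ebcdic_bytes.hex())
```

    The ASCII and UTF-8 byte sequences coincide with the code point
    values, while the EBCDIC bytes differ even though the characters
    are the same.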

    The problem is that ASCII does not have any such separation - it all
    takes place within Unicode.

    So the correct statement would be something like "the Unicode
    character subset set in bijective correspondence with the ASCII
    character set via the canonical injection from the ASCII character
    set into the Unicode character set".
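
    For anyone who wants the construction spelled out, here is a rough
    Python sketch. It treats the ASCII character set as the code
    positions 0..127 and maps each one to the Unicode character with
    the same number. The names canonical_injection, ASCII_IMAGE and
    in_ascii_subset are made up for this illustration.

```python
# Sketch of the canonical injection from ASCII into Unicode.
# All names here are hypothetical, introduced only for illustration.

ASCII_POSITIONS = range(128)  # the 7-bit ASCII code positions 0..127


def canonical_injection(position: int) -> str:
    """Map an ASCII code position to the Unicode character with the same number."""
    if position not in ASCII_POSITIONS:
        raise ValueError(f"{position} is not an ASCII code position")
    return chr(position)


# The image of the injection: the Unicode subset U+0000..U+007F.
ASCII_IMAGE = {canonical_injection(p) for p in ASCII_POSITIONS}

# 128 distinct images for 128 positions, so the map is injective,
# and therefore a bijection between ASCII and its image.
assert len(ASCII_IMAGE) == len(ASCII_POSITIONS)


def in_ascii_subset(ch: str) -> bool:
    """True if the Unicode character lies in the image of the injection."""
    return ch in ASCII_IMAGE


print(in_ascii_subset("e"))       # a character inside the subset
print(in_ascii_subset("\u00e9"))  # U+00E9 lies outside the subset
```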

    So my statement might perhaps be simplified to (omitting the details
    of the construction of the injection):
       This link overlooks the possibility of using a Unicode character
    subset not contained in the Unicode character subset set in bijection
    with the ASCII character set via the canonical injection from the
    ASCII character set into the Unicode character set for English
    spelling reform.

    It becomes so much clearer.

       Hans


