From: Hans Aberg (haberg@math.su.se)
Date: Sun Apr 12 2009 - 02:32:24 CDT
On 12 Apr 2009, at 03:05, Mark Davis wrote:
> I agree. One needs to distinguish the ASCII characters from the
> ASCII encoding scheme.
>
> The ASCII characters are represented in Unicode at codepoints
> U+0000..U+007F. The ASCII encoding scheme represents these as bytes
> %00..%7F, as does the UTF-8 encoding scheme. Other encoding schemes,
> like EBCDIC CCSID 500, may use different byte sequences for the
> ASCII characters.
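
The distinction drawn in the quoted paragraph can be checked with a
minimal sketch. It assumes Python's built-in codecs, and in particular
that the codec Python calls "cp500" is an adequate stand-in for EBCDIC
CCSID 500; the helper name encode_all is made up for illustration.

```python
# Sketch only: compare how three encoding schemes serialize the same
# ASCII characters. "cp500" is Python's codec name, assumed here to
# correspond to EBCDIC CCSID 500.

def encode_all(text):
    """Return the byte sequences for text under each encoding scheme."""
    return {
        "ascii": text.encode("ascii"),
        "utf-8": text.encode("utf-8"),
        "cp500": text.encode("cp500"),
    }

sample = "ASCII"

# The characters themselves: Unicode code points in U+0000..U+007F.
code_points = [ord(ch) for ch in sample]

# The encoding schemes: ASCII and UTF-8 agree byte for byte,
# while the EBCDIC code page uses different byte values.
encoded = encode_all(sample)

for name, data in encoded.items():
    print(name, data.hex(" "))
```

The code points are the same in all three cases; only the byte
sequences produced by the encoding scheme differ.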
The problem is that ASCII does not have any such separation - it all
takes place within Unicode.
So the correct statement would be something like "the Unicode
character subset set in bijective correspondence with the ASCII
character set via the canonical injection from the ASCII character
set into the Unicode character set".
So my statement might perhaps be simplified to (omitting the details
of the construction of the injection):
This link overlooks the possibility of using a Unicode character
subset not contained in the Unicode character subset set in bijection
with the ASCII character set via the canonical injection from the
ASCII character set into the Unicode character set for English
spelling reform.
It becomes so much clearer.
Hans