I found a small error in Technical Report #16, "UTF-EBCDIC."
In Section 3.5, "Signature," there is the following passage:
The signature character U+FEFF (zero width no-break space) of Unicode
transforms into the I8-byte sequence X'F1 BF B7 BF' which maps to
X'DD 73 66 73' in UTF-EBCDIC. When this sequence is displayed
(erroneously) using different a [sic] single-byte EBCDIC code pages,
it can be visualized as different character strings. In Latin-1
EBCDIC code page 1047 (and coincidentally also in Latin-1 code pages
500 and 37), this byte sequence appears as "ùËÃÊ" (small letter u
with grave, capital letter E with diaeresis, capital letter A with
tilde, capital letter E with circumflex).
Since the 4-byte I8 sequence contains two 0xBF bytes, and both of them
map to 0x73 (as of course they must), they cannot be displayed as the
two different characters 'Ë' and 'Ê'. The text should read:
... this byte sequence appears as "ùËÃË" (small letter u with grave,
capital letter E with diaeresis, capital letter A with tilde, capital
letter E with diaeresis).
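For anyone who wants to check this mechanically, here is a minimal Python sketch. It rests on some assumptions: the 4-byte I8 bit layout (11110www 101xxxxx 101yyyyy 101zzzzz) is my reading of the TR, the helper name i8_four_byte is made up for illustration, and the I8-to-EBCDIC map holds only the three byte pairs implied by the passage quoted above, not the full table. As far as I know, Python's standard library ships codecs for code pages 37 and 500 but not 1047, so only those two are tried.

```python
# Partial I8 -> UTF-EBCDIC byte map: only the three pairs implied by the
# quoted passage (F1 BF B7 BF -> DD 73 66 73), not the TR's full table.
I8_TO_EBCDIC = {0xF1: 0xDD, 0xBF: 0x73, 0xB7: 0x66}


def i8_four_byte(code_point: int) -> bytes:
    """Encode a code point in U+4000..U+3FFFF as a 4-byte I8 sequence.

    Assumed layout: 11110www 101xxxxx 101yyyyy 101zzzzz.
    """
    if not 0x4000 <= code_point <= 0x3FFFF:
        raise ValueError("this sketch handles only the 4-byte I8 range")
    return bytes([
        0xF0 | (code_point >> 15),           # lead byte, top 3 bits
        0xA0 | ((code_point >> 10) & 0x1F),  # trail byte, next 5 bits
        0xA0 | ((code_point >> 5) & 0x1F),   # trail byte, next 5 bits
        0xA0 | (code_point & 0x1F),          # trail byte, low 5 bits
    ])


i8 = i8_four_byte(0xFEFF)
utf_ebcdic = bytes(I8_TO_EBCDIC[b] for b in i8)

# Misinterpret the UTF-EBCDIC bytes as single-byte EBCDIC text.
rendered = {codec: utf_ebcdic.decode(codec) for codec in ("cp037", "cp500")}

print(i8.hex(" ").upper())
print(utf_ebcdic.hex(" ").upper())
print(rendered)
```

Because the second and fourth bytes are identical, any single-byte code page has to render them as the same character, whichever character that turns out to be.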
The stray "a" in the passage, which I marked with "[sic]", was left in for
accuracy, but it is not the error I am reporting. The TR contains
several such typos, so it would be unfair to single this one out.
-Doug Ewell
Fullerton, California