From: Frank Ellermann (nobody@xyzzy.claranet.de)
Date: Sat Feb 03 2007 - 11:47:32 CST
Philippe Verdy wrote:
> There's another editorial(?) error in UTS#40, in rule RD6 which states:
> RD6 Reset Byte: If lb is equal to 0x20, then do not output any code point
> but set prev=0x40. Continue with the next byte sequence
> Of course it should not be 0x20 but 0xFF! otherwise it conflicts with
> rule RD3 (space).
Yes, an obvious typo; the next (and last) line of RD6 says:
| * FF is a "reset-state-only" byte.
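To illustrate why the byte value matters, here is a minimal Python sketch of how a decoder might treat the special single bytes. The function name handle_special_byte is hypothetical. The handling of C0 controls (output directly, reset to prev=0x40) is an assumption based on the general BOCU-1 description, not on the rule text quoted above. Difference-encoded lead bytes are deliberately left out.

```python
RESET_STATE = 0x40  # initial / reset value of prev

def handle_special_byte(lb, prev):
    """Return (code_point_or_None, new_prev) for a special lead byte.

    Hypothetical helper: only the directly encoded bytes and the
    reset byte are covered; everything else raises ValueError.
    """
    if lb == 0xFF:
        # Reset byte: no output, state goes back to 0x40.
        return None, RESET_STATE
    if lb == 0x20:
        # Space: output U+0020, state is left unchanged.
        return 0x20, prev
    if lb < 0x20:
        # C0 control (assumption): output directly and reset the state.
        return lb, RESET_STATE
    raise ValueError("difference-encoded lead byte, not handled in this sketch")
```

If RD6 really tested for 0x20, the second branch could never be reached, which is the conflict with RD3.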
In another article you asked what it's good for. You could use it to
concatenate unknown (but otherwise valid) BOCU-1 strings. You could
also use it if a source contains no other (or not enough) code points
that cause a reset to prev=0x40, i.e. for strictly non-ASCII sources
(ignoring SP, which doesn't change the state).
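The concatenation case can be sketched in a few lines of Python. The name concat_bocu1 is hypothetical, and the sketch assumes every part is a complete, valid BOCU-1 string that was encoded starting from the initial state prev=0x40:

```python
RESET = b"\xff"  # the "reset-state-only" byte

def concat_bocu1(*parts):
    """Join BOCU-1 byte strings, putting FF between them.

    Each part is assumed to start from prev=0x40, so the state left
    behind by the preceding part has to be reset before it.
    """
    return RESET.join(parts)
```

For example, concat_bocu1(a, b) yields a + FF + b. The FF costs one byte per joint, even where the preceding part happens to end in state 0x40 anyway.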
A single damaged bit can destroy a complete "line" of anything not in
state prev=0x40. FF makes it possible to limit the "line length" at risk. The
disadvantages of FF are clearly stated in the last paragraph of 2.4.
In a third article you noted that a signature FB EE 28 can't simply be
removed, because it has a side effect on the state. That's true; it could be
noted in chapter 2.5. Using FB EE 28 FF (without side effect) is also
possible, but I think that's a dubious kludge. Nobody promised that
removing signatures is always possible without other effects.
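A cautious way to handle this is to strip the signature only where that is known to be harmless. The following Python sketch uses the hypothetical name strip_signature_if_safe and treats just two cases as safe: the data is nothing but the signature, or the signature is immediately followed by FF. Anything else is returned unchanged, since the text after a bare signature was encoded relative to the state the signature left behind.

```python
SIGNATURE = b"\xfb\xee\x28"  # BOCU-1 signature bytes
RESET = b"\xff"              # the "reset-state-only" byte

def strip_signature_if_safe(data):
    """Remove a leading signature only if the decoder state is unaffected."""
    if data == SIGNATURE:
        # Nothing follows, so no later bytes depend on the state.
        return b""
    if data.startswith(SIGNATURE + RESET):
        # FF resets prev to 0x40, the same as the initial state,
        # so both the signature and the FF can be dropped.
        return data[len(SIGNATURE) + len(RESET):]
    # Bare signature followed by text: leave it alone.
    return data
```

Removing a bare signature correctly would mean decoding and re-encoding the text, which this sketch does not attempt.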
I don't think it's a "serious bug"; it's only a potential trap, and if
that's explicitly noted in chapter 2.5, it's a (harmless) "feature".
Frank