From: John Cowan (jcowan@reutershealth.com)
Date: Tue Dec 14 2004 - 07:54:53 CST
Doug Ewell scripsit:
> "When faced with [an] ill-formed code unit sequence while transforming
> or interpreting text, a conformant process must treat the first code
> unit... as an illegally terminated code unit sequence -- for example, by
> signaling an error, filtering the code unit out, or representing the
> code unit with a marker such as U+FFFD REPLACEMENT CHARACTER."
Plan 9, the original all-UTF-8 environment (it was translated
in a single day from Latin-1 to UTF-8), represents ill-formed code unit
sequences with the otherwise useless U+0080, on the grounds that an
ill-formed sequence is semantically different from an untranslatable
character, and marking the latter is the purpose of U+FFFD.
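
To make the distinction concrete, here is a minimal sketch in Python (not
Plan 9's actual C code) of a UTF-8 decoder that marks ill-formed input
with U+0080 rather than U+FFFD. The handler name "marker_u0080" and the
function decode_plan9_style are hypothetical names chosen for this
illustration, and emitting one marker per error reported by Python's
decoder is an assumption; Plan 9's own routines may divide the bad bytes
differently.

```python
import codecs


def _u0080_handler(err):
    # Hypothetical error handler: emit U+0080 for each ill-formed
    # sequence the decoder reports, then resume after the bad bytes.
    if not isinstance(err, UnicodeDecodeError):
        raise err
    return ("\u0080", err.end)


# "marker_u0080" is a name invented for this sketch.
codecs.register_error("marker_u0080", _u0080_handler)


def decode_plan9_style(data: bytes) -> str:
    """Decode UTF-8, marking ill-formed sequences with U+0080."""
    return data.decode("utf-8", errors="marker_u0080")


def decode_unicode_style(data: bytes) -> str:
    """Decode UTF-8, marking ill-formed sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    bad = b"ab\xffcd"  # 0xFF can never appear in well-formed UTF-8
    print(repr(decode_plan9_style(bad)))
    print(repr(decode_unicode_style(bad)))
```

With this approach U+FFFD stays reserved for characters that could not be
translated from some other encoding. The cost is that a U+0080 in the
output no longer tells you whether it was a decoding marker or a genuine
C2 80 sequence in the input.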
-- 
LEAR: Dost thou call me fool, boy?      John Cowan
FOOL: All thy other titles              http://www.ccil.org/~cowan
      thou hast given away:             jcowan@reutershealth.com
      That thou wast born with.         http://www.reutershealth.com
This archive was generated by hypermail 2.1.5 : Tue Dec 14 2004 - 07:59:25 CST