Doug,
On 12/19/2016 6:08 PM, Doug Ewell wrote:
> I thought there was a corrigendum or other, comparatively recent
> addition to the Standard that spelled out how replacement characters
> are supposed to be substituted for invalid code unit sequences --
> something about detecting maximally long sequences. I'll look when I
> have a chance.
>
You found the resulting text in TUS 9.0, pp. 126-129. The text there on
best practices for using U+FFFD originated in the discussion and
resolution of PRI #121 in August 2008:
http://www.unicode.org/review/pr-121.html
That was discussed at UTC #116. See the minutes:
http://www.unicode.org/L2/L2008/08253.htm
There was feedback at the time advocating the 3rd option, rather than
the 2nd one that was eventually chosen by the UTC. See:
http://www.unicode.org/L2/L2008/08280-pri121-cmt.txt
The actual text that resulted was first published in Unicode 5.2, p. 95:
http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf
Contrast that with the text in Unicode 5.0, which had no extended
discussion about handling conversion errors there. The Unicode 5.2 text
was later expanded with more definitions and explanation, to what you
see now in Unicode 9.0.
--Ken
Received on Tue Dec 20 2016 - 11:00:01 CST