> Sorry, but I have to disagree here. If a list of strings contains items
> with lone surrogates (garbage), then sorting them doesn't make the
> garbage go away, even if the items may be sorted in "correct" order
> according to some criterion.
Well, yeah, I wasn't claiming that the principled, "correct" output made the garbage go away.
Let me put it this way: if my choices are 1) garbage in, garbage reliably sorted out into garbage bin, versus 2) garbage in, sorting fails with exception, then I'll pick #1. ;-)
To give a concrete example, my implementation of UCA reliably passes the SHIFTED test cases in the conformance test, even though those test cases (deliberately) contain some ill-formed strings. If I instead validated every input string in my base implementation, it would be slower, *and* to pass the conformance test I would have to add a separate preprocessing stage that probed all the input data for ill-formed strings and filtered those cases out before running the test, so that it wouldn't fail with an exception when it hit the bad data.
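
To make the trade-off concrete, here is a minimal Python sketch. It is not the UCA implementation described above: it substitutes plain code point order for real collation weights, and the names `tolerant_key`, `strict_key`, and `sort_with_prefilter` are hypothetical, chosen only for illustration.

```python
# Sketch only: plain code point order stands in for real collation weights.
# In a Python str, any code point in U+D800..U+DFFF is a lone surrogate,
# because well-formed supplementary characters are stored as single code points.

def has_lone_surrogate(s: str) -> bool:
    """Return True if the string contains any surrogate code point."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)

def tolerant_key(s: str) -> tuple:
    """Choice 1: a total order over all strings, ill-formed or not.
    Garbage gets a deterministic position instead of raising."""
    return tuple(ord(ch) for ch in s)

def strict_key(s: str) -> tuple:
    """Choice 2: validate every input, at extra cost per string,
    and raise on ill-formed data."""
    if has_lone_surrogate(s):
        raise ValueError("ill-formed string: lone surrogate")
    return tuple(ord(ch) for ch in s)

def sort_with_prefilter(strings):
    """What choice 2 needs in order to survive test data that deliberately
    contains ill-formed strings: a separate filtering pass before sorting."""
    well_formed = [s for s in strings if not has_lone_surrogate(s)]
    return sorted(well_formed, key=strict_key)

data = ["b", "a\ud800", "a"]
tolerant_result = sorted(data, key=tolerant_key)   # never raises
filtered_result = sort_with_prefilter(data)        # silently drops bad items
```

With the tolerant key, the ill-formed item stays in the output at a reproducible position. With the strict key, calling `sorted(data, key=strict_key)` directly raises `ValueError`, so the caller needs the extra filtering pass.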
--Ken
Received on Tue Jan 08 2013 - 12:32:54 CST