David Starner, normally <dstarner98@aasaa.ofe.org> but on this occasion
<dvdeug@hushmail.com>, wrote:
> I was having some problems with a test of my SCSU decoder recently,
> and I discovered it was due to my decoder rejecting 10FFFF as a valid
> Unicode value (because it ends in FFFF.) The fourth test pattern,
> Section 9.4 of Tech Report 6 (SCSU) uses DBFF DFFF as a surrogate
> pair, which is 10FFFF. Is this wrong, or is there something I'm
> overlooking?
Good question. Unicode scalar values ending in FFFE and FFFF do not
represent valid characters, but by definition D29 (recently clarified
for me) a UTF must encode and decode these values. SCSU is not a UTF,
but my guess is that this requirement should apply to SCSU as well.
I think the SCSU decoder should go ahead and decode the 0B BF FF and
subsequent 15 FF as U+10FFFF, and leave the job of deciding which values
are valid or invalid to the higher-level process that interprets them.
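
To make the arithmetic concrete, here is a rough Python sketch, not taken from any real decoder. The function names are my own, and the SDX formula is the one I read out of the tag definition in Tech Report 6, so check it against the report. It combines the DBFF DFFF surrogate pair, works out the window that 0B BF FF defines, and keeps the FFFE/FFFF test in a separate function that the decoder itself never calls.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one Unicode scalar value."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def sdx_window(hbyte, lbyte):
    """Decode the two argument bytes that follow the SCSU SDX tag (0x0B).

    Assumed layout: the top 3 bits of hbyte select the dynamic window,
    the remaining 13 bits give the offset in units of 0x80 above 0x10000.
    Returns (window index, window start).
    """
    index = hbyte >> 5
    start = 0x10000 + 0x80 * (((hbyte & 0x1F) << 8) | lbyte)
    return index, start


def window_byte_to_scalar(start, byte):
    """Map a single-byte-mode data byte 0x80..0xFF into the active window."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("not a dynamic-window data byte")
    return start + (byte - 0x80)


def ends_in_fffe_or_ffff(scalar):
    """Check meant for the higher-level process, not for the decoder."""
    return (scalar & 0xFFFF) >= 0xFFFE


index, start = sdx_window(0xBF, 0xFF)        # bytes after the 0B tag
scalar = window_byte_to_scalar(start, 0xFF)  # data byte FF in that window
print(index, hex(start), hex(scalar))
print(hex(combine_surrogates(0xDBFF, 0xDFFF)))
```

Under those assumptions, 0B BF FF sets window 5 to start at 10FF80, and a data byte FF in that window lands on 10FFFF, the same value the surrogate pair gives. The later 15 selects window 5 again, so the FF after it decodes the same way. The decoder passes the value through; only a caller that cares would use ends_in_fffe_or_ffff.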
-Doug Ewell
Fullerton, California