From: Mike (mike-list@pobox.com)
Date: Thu Apr 27 2006 - 13:53:04 CST
>> I am implementing the UCA and am having trouble
>> passing the conformance test....
>
> The problem lies in the interpretation of 'combining mark'. I'd taken
> it to mean a character with non-zero combining class. Moreover, I think
> this is what was intended!
That was the problem. I modified my code to stop
trying to form contractions as soon as a combining
mark of combining class 0 is encountered. It now
passes the conformance tests (as long as I discard the
level-four collation data in the NON_IGNORABLE test).
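
For anyone hitting the same thing, here is a rough sketch of the check
described above. It is not my actual code: the function name
contraction_candidates is made up for illustration, and it only covers
the "stop at combining class 0" part, not the full blocking rule from
the UCA. It leans on Python's unicodedata.combining(), which returns
the canonical combining class.

```python
import unicodedata


def contraction_candidates(text, start):
    """Return the indices after `start` that may still extend a contraction.

    Hypothetical helper: scanning stops at the first character whose
    canonical combining class is 0, even when that character is a
    combining mark by general category (Mn, Mc, Me).
    """
    candidates = []
    for i in range(start + 1, len(text)):
        if unicodedata.combining(text[i]) == 0:
            # Class 0 ends the search, whatever the general category is.
            break
        candidates.append(i)
    return candidates


# U+0323 (class 220) and U+0301 (class 230) are candidates;
# U+093E is a combining mark (Mc) of class 0, so the scan stops there.
sample = "a\u0323\u0301\u093eb"
indices = contraction_candidates(sample, 0)
```

The point of the example is U+093E: it is a 'combining mark' by general
category but has combining class 0, which is exactly the case where the
two readings of the rule disagree.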
> I was able to get through the test - once I'd decided that unpaired
> surrogates should not be converted to the replacement character!
Well, I had to ignore the tests with surrogates in
them. All my code deals in UTF-8 strings, so to
stay conformant in UTF-8 processing, it raises an
exception when a surrogate (paired or not) is found.
I am comfortable with that.
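
As an illustration of that behaviour (again a sketch, not my code, and
the name decode_utf8_strict is invented), Python's built-in strict
UTF-8 decoder rejects encoded surrogates in the same way:

```python
def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on ill-formed input.

    Hypothetical wrapper around bytes.decode(); byte sequences that
    would decode to surrogate code points (U+D800..U+DFFF) are
    ill-formed UTF-8 and are rejected, paired or not.
    """
    return data.decode("utf-8", errors="strict")


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly, False otherwise."""
    try:
        decode_utf8_strict(data)
    except UnicodeDecodeError:
        return False
    return True


lone_surrogate = b"\xed\xa0\x80"                 # would be U+D800
surrogate_pair = b"\xed\xa0\xbd\xed\xb8\x80"     # would be U+D83D U+DE00
plain_text = "caf\u00e9".encode("utf-8")
```

With this approach the surrogate test lines simply have to be skipped,
since there is no well-formed UTF-8 input that represents them.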
> I think the rule should be amended by replacing 'combining mark' by
> 'character of non-zero combining class', but a more elegantly phrased
> alternative would be still better.
Yes, that would eliminate the confusion.
Mike