Sun, 16 Sep 2001 01:14:06 -0700, Carl W. Brown <cbrown@xnetinc.com> wrote:
> If it can be demonstrated that there is a real need for an encoding
> like CESU-8 then it should be very different from UTF-8. How does
> SCSU for example sort?
SCSU encoding is non-deterministic: the same string can have several
valid encoded forms, so its representations can't be compared
lexicographically at all (logically equal strings might compare
unequal).
Ehh, we wouldn't have the problem with CESU-8 now if Unicode hadn't
been described as a 16-bit encoding in the past. I still think that
UTF-16 was a big mistake. Too bad that it still affects people who
avoid it.
We can't change the past, but I hope that at least UTF-8 processing can
be done without treating surrogates in any special way. Surrogates are
relevant only for UTF-16; by not using UTF-16 you should be free of
surrogate issues, except for having a silly unused area in character
numbers and a silly highest character number. Please don't spread
UTF-16 madness where it doesn't belong.
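
To make the sorting issue concrete, here is a minimal Python sketch. It
assumes the usual description of CESU-8 (each UTF-16 code unit,
surrogates included, encoded with the UTF-8 bit layout); the helper
names utf16_units and cesu8 are made up for illustration and are not
from any library. It compares U+FFFD (in the BMP) with U+10000 (the
first character that needs a surrogate pair in UTF-16).

```python
# Compare binary sort order of one BMP character and one supplementary
# character under code point order, UTF-8, UTF-16 code units, and CESU-8.

def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def cesu8(s):
    """Hypothetical helper: encode each UTF-16 code unit with the UTF-8
    bit layout, which is how CESU-8 is usually described."""
    return b"".join(
        chr(unit).encode("utf-8", "surrogatepass") for unit in utf16_units(s)
    )

bmp = "\uFFFD"        # above the surrogate area, still in the BMP
supp = "\U00010000"   # first supplementary character

# Python compares str values by code point.
code_point_order = bmp < supp
# UTF-8 bytes: EF BF BD versus F0 90 80 80.
utf8_order = bmp.encode("utf-8") < supp.encode("utf-8")
# UTF-16 code units: [FFFD] versus [D800, DC00].
utf16_order = utf16_units(bmp) < utf16_units(supp)
# CESU-8 bytes: EF BF BD versus ED A0 80 ED B0 80.
cesu8_order = cesu8(bmp) < cesu8(supp)

print(code_point_order, utf8_order, utf16_order, cesu8_order)
```

UTF-8 bytes sort the same way as code points, while UTF-16 code units
and CESU-8 bytes both put the supplementary character first. Plain
UTF-8 processing never has to look at surrogates to get code point
order.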
-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SUBSTITUTE SIGNATURE
QRCZAK