From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Thu Dec 06 2007 - 14:52:11 CST
Otto Stolz wrote:
> Andreas Prilop wrote:
>> So with respect to rot13, U+00E3 and U+0061 U+0303
>> are not equivalent.
>
> This means that a naïve rot13 implementation will not be standard
> conforming w. r. t.
> <http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf>, clause C6.
I wouldn't say so. Clause C6 does not require that canonical equivalents
be treated as identical. Rather, it requires that you not rely on others
treating them as different. I think the standard says this rather
explicitly.
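To make the quoted example concrete, here is a minimal Python sketch. It
assumes that a "naive rot13" means Python's built-in "rot_13" text codec,
which rotates only the ASCII letters and passes everything else through.
The variable names are illustrative only.

```python
import codecs
import unicodedata

precomposed = "\u00e3"    # U+00E3, a with tilde as a single code point
decomposed = "a\u0303"    # U+0061 U+0303, a followed by combining tilde

# Naive rot13: only ASCII letters are rotated.
r1 = codecs.encode(precomposed, "rot_13")   # U+00E3 is left unchanged
r2 = codecs.encode(decomposed, "rot_13")    # becomes U+006E U+0303

def nfc(s):
    """Normalize to NFC so canonical equivalents compare equal."""
    return unicodedata.normalize("NFC", s)

# The inputs are canonically equivalent, the outputs are not.
inputs_equivalent = nfc(precomposed) == nfc(decomposed)
outputs_equivalent = nfc(r1) == nfc(r2)
```

So the two spellings of the same text come out of the converter as
different texts, which is the sense in which they are "not equivalent"
with respect to rot13.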
> But then, rot13 has never claimed to be so.
Rot13 is supposed to be a simple reversible obfuscation method that
prevents people from _accidentally_ seeing some text they might not want
to see. I think it does this relatively well for predominantly ASCII
text. It operates at the plain text level, so no special software is
needed, except for a very trivial converter that often appears as a
built-in feature in e-mail programs.
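As an illustration of how trivial such a converter can be, here is a
minimal Python sketch using a translation table. The function name rot13
is just a label chosen here, and the sketch assumes that only the 52 basic
Latin letters are to be rotated.

```python
import string

lower = string.ascii_lowercase
upper = string.ascii_uppercase

# Map each ASCII letter to the letter 13 places later, wrapping around.
_ROT13_TABLE = str.maketrans(
    lower + upper,
    lower[13:] + lower[:13] + upper[13:] + upper[:13],
)

def rot13(text):
    """Rotate ASCII letters by 13 places; leave all other characters alone."""
    return text.translate(_ROT13_TABLE)

hidden = rot13("Hello, World!")
restored = rot13(hidden)   # applying rot13 twice restores the original
```

Since 13 is half of 26, the same function serves for both encoding and
decoding.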
> Of course, you could come up with a Unicode-capable, standard
> conforming rot13 implementation -- but what purpose could it ever
> serve?
Well, the same purpose. But rot13 is by definition for a limited
character repertoire. There are many ways to generalize it. But if the
generalization needs to coincide with rot13 for ASCII letters, there is
no method that would result in anything as simple as rot13. One
possibility would be to give up the rotation idea and simply add 13 to
each code number. Then the decoding algorithm would no longer be the
same as the encoding algorithm but would use subtraction. But it would
obfuscate sufficiently and simply - though only at the code number
level. Doing that to UTF-8 data would require more processing.
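A rough Python sketch of the "add 13" idea follows. All names in it are
hypothetical, and it makes one assumption that is not part of the idea as
stated: it raises an error when a shifted value would fall outside the
Unicode code space or into the surrogate range, rather than defining what
should happen there. It coincides with rot13 only for the letters a-m and
A-M. The last two functions show the extra processing needed for UTF-8
data, which has to be decoded to code points and encoded again.

```python
SHIFT = 13
MAX_CODE_POINT = 0x10FFFF

def shift_text(text, delta):
    """Add delta to the code number of every character in text."""
    out = []
    for ch in text:
        cp = ord(ch) + delta
        # Assumption: values that are not valid scalar values are rejected.
        if cp < 0 or cp > MAX_CODE_POINT or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("shifted code point out of range: %X" % cp)
        out.append(chr(cp))
    return "".join(out)

def encode(text):
    return shift_text(text, SHIFT)

def decode(text):
    # Decoding is no longer the same operation: it subtracts.
    return shift_text(text, -SHIFT)

def encode_utf8(data):
    """Shift UTF-8 bytes: decode to code points, shift, encode again."""
    return encode(data.decode("utf-8")).encode("utf-8")

def decode_utf8(data):
    return decode(data.decode("utf-8")).encode("utf-8")

sample = "p\u00e4iv\u00e4\u00e4"
round_trip = decode(encode(sample))
round_trip_utf8 = decode_utf8(encode_utf8(sample.encode("utf-8")))
```

Note that the shifted text may contain control characters or unassigned
code points, so the result is obfuscated only "at the code number level"
and is not necessarily displayable text.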
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/