Doug Ewell wrote:
> Marco Cimarosti <Marco.Cimarosti@icl.com> wrote:
> > a, B, c, e, H, i, j, K, M, n, o, p, s, T, u, x, or y to be
> This is a potential can of worms, because "look the same" is not a
> Boolean property for glyphs. What about U+0076 LATIN SMALL LETTER V
> and U+03BD GREEK SMALL LETTER NU, for example, or U+0070 LATIN SMALL
> LETTER P and U+03C1 GREEK SMALL LETTER RHO? These pairs do not look
> 100% identical, but would probably still confuse a user who does not
> expect a URL to contain characters from mixed scripts.
> The point is that with 50,000 possible characters, there is no place
> you can safely draw this line.
Good points. I agree that there will never be a "perfect" solution, and
there will always remain enough room for misunderstanding, and even for
This is not different from many other similar issues: there will never be
"the" perfect collation, or case conversion, or loose match.
This does not mean, however, that a partial best-fit solution cannot be
> The same could be said for the fuzzy second-step category "all
> characters that are not essential."
Agreed. Nevertheless, you cannot simply say "OK, then you gotta type in a
string which is identical, at the binary level, with the one stored in the
DNS server, or you won't connect to the server".
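As a rough illustration of what such a partial best-fit match might look like, here is a minimal Python sketch. The `CONFUSABLES` table and the `skeleton` and `loose_match` names are hypothetical, invented for this example; the table lists only the two Latin/Greek pairs mentioned above and is nowhere near complete. It is a sketch of the idea, not a proposal for what a DNS resolver should actually do.

```python
import unicodedata

# Hypothetical, deliberately incomplete look-alike table (an assumption for
# illustration): only the two pairs named in this thread are listed.
CONFUSABLES = {
    "\u03bd": "v",  # GREEK SMALL LETTER NU  -> LATIN SMALL LETTER V
    "\u03c1": "p",  # GREEK SMALL LETTER RHO -> LATIN SMALL LETTER P
}


def skeleton(name: str) -> str:
    """Reduce a name to a best-fit comparison key.

    Steps: compatibility normalization (NFKC), case folding, then
    replacement of the listed look-alikes by their Latin counterparts.
    """
    folded = unicodedata.normalize("NFKC", name).casefold()
    return "".join(CONFUSABLES.get(ch, ch) for ch in folded)


def loose_match(typed: str, stored: str) -> bool:
    """Return True if the two names share the same comparison key."""
    return skeleton(typed) == skeleton(stored)
```

With this table, a name typed with GREEK SMALL LETTER NU in place of "v" compares equal to the all-Latin spelling, while names that differ in any unlisted character still compare unequal. Where the table stops is exactly the line that cannot be drawn safely.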
So, again, I was just speculating about viable improvements to an overly
naïve idea, with no illusion of finding an alchemical perfect solution.