From: Doug Ewell (dewell@adelphia.net)
Date: Fri Feb 11 2005 - 10:28:22 CST
Marcin 'Qrczak' Kowalczyk <qrczak at knm dot org dot pl> wrote:
> Don't look for wrong patterns. Ensure that there is a good pattern
> instead. In particular characters not belonging to any regular writing
> system, like arrows or half-wide Latin letters, are rejected.
IDN strings go through a process called "nameprep" before being encoded
in Punycode. Nameprep is a combination of NFKC, case folding, removal of
control characters and space characters, etc. This means it should never
be possible to create a domain name like paypal.com (with a zero-width
space, ZWSP, hidden inside) or paypal.com (spelled in fullwidth Latin).
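To make that folding concrete, here is a rough Python sketch of the idea. It is not the real nameprep profile: the function name rough_nameprep and the small MAPPED_TO_NOTHING set are assumptions made for illustration, covering only a handful of the invisible characters that the real tables map to nothing, and the prohibited-character and bidi checks are left out entirely.

```python
import unicodedata

# Assumption: a small illustrative subset of invisible characters that
# get mapped to nothing; the real stringprep table is longer.
MAPPED_TO_NOTHING = {
    "\u00ad",  # SOFT HYPHEN
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE
}


def rough_nameprep(label: str) -> str:
    """Approximate the mapping and normalization steps described above."""
    # Step 1: drop the invisible characters listed above.
    visible = "".join(ch for ch in label if ch not in MAPPED_TO_NOTHING)
    # Step 2: case folding.
    folded = visible.casefold()
    # Step 3: NFKC, which replaces fullwidth Latin with plain ASCII.
    return unicodedata.normalize("NFKC", folded)


if __name__ == "__main__":
    with_zwsp = "pay\u200bpal.com"
    fullwidth = "\uff50\uff41\uff59\uff50\uff41\uff4c.com"
    print(rough_nameprep(with_zwsp) == rough_nameprep(fullwidth))
```

Both disguised spellings collapse to the same plain ASCII string, so neither could be registered as a separate name.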
What nameprep explicitly does *not* do is attempt to create a mapping
between Latin p, Greek ρ, Cyrillic р, Cherokee Ꮲ, Deseret 𐑁, and so
forth. This is just too slippery and font-dependent. I've noticed
over the last few years that despite all the calls for
visual-similarity mapping tables, nobody actually volunteers to undertake
this project. Probably they stop as soon as they encounter the
"semi-confusables" like υ and к and realize it's not as simple an issue
as they thought.
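A quick way to see that this kind of normalization leaves the lookalikes alone is to run the five letters mentioned above through case folding plus NFKC. The sketch below assumes the code points U+0070, U+03C1, U+0440, U+13E2 and U+10441 for those letters; the helper name fold is made up for the example, and it stands in for nameprep only loosely.

```python
import unicodedata

# The five lookalike letters, written as escapes so the code points
# are unambiguous (assumed from the characters named in the text).
LOOKALIKES = ["p", "\u03c1", "\u0440", "\u13e2", "\U00010441"]


def fold(ch: str) -> str:
    """Case-fold, then apply NFKC; a loose stand-in for nameprep."""
    return unicodedata.normalize("NFKC", ch.casefold())


def distinct_after_folding(chars):
    """Return the set of distinct results after folding each character."""
    return {fold(ch) for ch in chars}


if __name__ == "__main__":
    for ch in LOOKALIKES:
        result = fold(ch)
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> U+{ord(result):04X}")
    print(len(distinct_after_folding(LOOKALIKES)), "distinct results")
```

Each letter survives as its own code point, so any cross-script unification would have to come from a separate, hand-built table.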
I don't know about arrows, but it seems unlikely that these would be
useful for spoofing.
-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/