Re: [idn] IDN spoofing

From: George W Gerrity (g.gerrity@gwg-associates.com.au)
Date: Mon Feb 21 2005 - 06:51:05 CST

Next message: Patrick Andries: "diacritic spoofing (Re: orthographies)"

Previous message: Jörg Knappen: "Re: Uppercase variant of U+00DF LATIN SMALL LETTER SHARP S ("German sharp s", "ß" )"
Maybe in reply to: Erik van der Poel: "Re: [idn] IDN spoofing"
Next in thread: Erik van der Poel: "Re: [idn] IDN spoofing"

On 21 Feb 2005, at 22:36, William Tan wrote:

> George W Gerrity wrote:
>
>> For the second-level (or third-level where the top is a country code)
>> domain tag, it should be the legal responsibility of the name
>> authorities for the domain above to ensure that spoofed names cannot
>> be registered (or if registered, all belong to one owner). In the
>> Western world, if that is not already the case, then I'm sure that
>> the first time a spoof of, say Coca-Cola (or Pepsi — let's be
>> even-handed) is registered, then we can be certain that afterwards,
>> the issuing authority will never do it again.
>
> While it is true that TLDs are responsible for preventing the
> registration of spoofs, commercial TLDs that have automated
> registration systems never perform that check.

I'm suggesting that pretty soon they will start, or get sued by the big
boys. It would be a help if the terms of their licensing included a
requirement to take all reasonable steps to disallow spoofs. Otherwise,
if name handling is automatic, being a TLD Authority is just a license
to print money.

> Does registering coca-cola.com prevent someone else from getting
> coca-co1a.com?

Quite obviously, it doesn't right now, until (not if) the “real”
Coca-Cola comes down like a ton of bricks on the TLD Authority (rather
than the spoofer: deepest hip pocket principle). I don't know if this
particular case has occurred, but certainly there have been court
cases, fought successfully, against people who have jumped in and
registered a trademarked name as part of a domain. As the usage and law
of domain names mature, a natural extension is to include spoofs as
well as trademarks in the list of names not registrable except by the
owner of the trademark or non-spoof tag.

>> In the case of countries whose legal systems are still a bit wild and
>> woolly (The former Soviet Union?), then I suspect that for the time
>> being it will remain ‘Caveat Emptor’. In either case, a domain name
>> holder should be able to license all spoofs for free, in order to
>> limit its exposure to spoofing, whether or not there is adequate
>> legal recourse.
>
> If the TLD operator is careful, there is no need to license spoofs to
> protect one's domain from being spoofed. On the other hand, if the TLD
> does not even perform that check (such as .com), then it is unlikely
> that you get to license all spoofs for free anyway - you have to pay
> for each and every permutation of it.

Hence the reason, in the short term, for allowing the owner of an
original to register all spoofs free of charge. Currently, I believe
that some big internationals are already doing that, i.e., registering
all lookalikes in every conceivable domain. They can afford it;
start-ups can't. When (not if) the lawsuits start to come in, TLD
operators will be happy to license spoofs for free to the legit holder
of a name (or at least add them to their tables of non-licensable
names), because it will mean fewer potential court cases.

>> The point I'm making is that while the authorities for .com.au or
>> .com.ru may do what they like, we can at least give them advice plus
>> some tables that will detect many, if not most, spoofs. In the case
>> where the authority allows (for whatever reason) a name with mixed
>> orthographies, then clearly the first to apply whose signature is not
>> a spoof for an (already well-established) trade-marked name or domain
>> name, should get the license, and all other applicants with a similar
>> name be refused. The name authority should be protected by the laws
>> of the countries in which it operates from being sued for refusing to
>> register confusable names.
>
> This is a fairly interesting proposal, i.e. to use the bundling (see
> draft-klensin-reg-guidelines or rfc3743) to solve the homograph
> problem at the registry level, provided we can come up with a
> satisfactory table of lookalikes.

No reason why we should look around too much for homographs, which, as
has already been said, depend so much on the actual fonts used. I can't
imagine, for instance, that a match between Greek lowercase omega and
Latin w shouldn't always be termed a homograph. It is then up to the
registry to determine whether it will allow isolated Greek omegas to
appear in an otherwise Latin string, or even whether it will allow any
sort of mixed string.
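
The registry-side decision described above could be sketched roughly as
follows. This is only an illustration: it assumes that the first word of
a character's Unicode name (LATIN, GREEK, CYRILLIC, ...) is a good enough
stand-in for its script, and the function names are made up for this
example.

```python
import unicodedata


def scripts_in_label(label):
    """Return the set of script names used by the letters in a label.

    Heuristic (an assumption, not a real script property lookup): the
    first word of the Unicode character name stands in for the script.
    Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if unicodedata.category(ch).startswith("L"):
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed(label):
    """True if the letters of the label come from more than one script."""
    return len(scripts_in_label(label)) > 1


# "co" + GREEK SMALL LETTER OMEGA + "boy" looks roughly like "cowboy".
print(sorted(scripts_in_label("co\u03c9boy")))
print(is_mixed("cowboy"))
```

A registry that refuses all mixed strings would reject any label for
which is_mixed() is true; one that allows some mixing would need a
finer rule on top of this.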

> As an example, the word "coke" can be represented completely in
> Cyrillic homographs, so one can generate 16 combinations of ASCII and
> Cyrillic characters forming strings that look like "coke". When you
> register "coke.com", the other 15 variants are automatically tied to
> this domain (for free or for a fee). They can be either all activated
> (put into the zone file) or simply blocked from registration.
>
> The good thing about this is that the lookalikes mapping table does
> not have to be set-in-stone at the protocol level, but individual
> registries may choose to implement whatever makes sense for them.

Exactly. But you can be certain that as the domain system matures, the
rules at TLD registries will tighten up, if only so that they can
automate the finding of spoofs of legitimate, already-registered names.
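
One way a registry might automate this, sketched under heavy
assumptions: fold every character to the ASCII character it resembles,
and let only the first owner of a folded form register further names
with that form. The LOOKALIKES table below is a hypothetical,
deliberately tiny sample, not a real homograph table, and the Registry
class is invented for illustration.

```python
# Hypothetical lookalike table: confusable character -> ASCII lookalike.
LOOKALIKES = {
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u043a": "k",  # CYRILLIC SMALL LETTER KA
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u03c9": "w",  # GREEK SMALL LETTER OMEGA
    "1": "l",       # DIGIT ONE, as in coca-co1a
}


def skeleton(label):
    """Fold a label to the form used for lookalike comparison."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in label.lower())


class Registry:
    """Toy registry: all names sharing a skeleton belong to one owner."""

    def __init__(self):
        self._owner_by_skeleton = {}

    def register(self, label, owner):
        """Return True if the name is granted, False if it is refused."""
        holder = self._owner_by_skeleton.setdefault(skeleton(label), owner)
        return holder == owner


registry = Registry()
print(registry.register("coca-cola", "Coca-Cola"))  # first applicant
print(registry.register("coca-co1a", "spoofer"))    # same skeleton, other owner
```

The same owner applying for the lookalike is still accepted, which
corresponds to letting the legitimate holder take up the spoofs.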

> The problem with this is that the number of variants gets out of hand
> pretty quickly, and most registry systems aren't equipped to deal with
> bundles.

Yep. That's why the rules will tighten up. For instance, the .com.ru
registry might refuse names containing characters from the Greek or
Coptic code groups, and might even refuse names that mix Cyrillic and
Roman characters where any run of characters from one set is shorter
than two, or where there are more than three such runs. That reduces
the size of the lookup tables considerably. In your example of ‘coke’,
there are only two mixed combinations in which every substring (each
from one set, but differing from its neighbouring substrings) is of
size > 1.
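
The ‘coke’ arithmetic can be checked with a short sketch. It assumes
the four Cyrillic lookalikes listed in the table, treats anything not
in that table as Roman, and reads the rule as: every single-alphabet
run must have at least two characters, and there may be at most three
runs. The function names are hypothetical.

```python
from itertools import groupby, product

# Assumed Roman -> Cyrillic lookalikes for the letters of "coke".
LATIN_TO_CYRILLIC = {
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "k": "\u043a",  # CYRILLIC SMALL LETTER KA
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
}
CYRILLIC = set(LATIN_TO_CYRILLIC.values())


def variants(label):
    """All spellings of label with each letter either Roman or Cyrillic."""
    choices = [
        (ch, LATIN_TO_CYRILLIC[ch]) if ch in LATIN_TO_CYRILLIC else (ch,)
        for ch in label
    ]
    return ["".join(combo) for combo in product(*choices)]


def run_lengths(label):
    """Lengths of the maximal runs of same-alphabet characters."""
    flags = [ch in CYRILLIC for ch in label]
    return [len(list(group)) for _, group in groupby(flags)]


def allowed_mixed(label, min_run=2, max_runs=3):
    """True for a mixed label whose runs satisfy the tightened rule."""
    runs = run_lengths(label)
    return 1 < len(runs) <= max_runs and min(runs) >= min_run


all_spellings = variants("coke")
survivors = [v for v in all_spellings if allowed_mixed(v)]
print(len(all_spellings), len(survivors))
```

Under these assumptions there are 16 spellings in total (the original
plus 15 variants), and only two mixed ones pass the rule: Roman "co"
followed by Cyrillic "ke", and Cyrillic "co" followed by Roman "ke".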

However, while these suggestions will be a help in forming rules, and
the existence of lists or tables of homographs will be welcome, in the
long run it will be up to the registries to get it right, or they will
find themselves out of business due to legal costs.

George


This archive was generated by hypermail 2.1.5 : Mon Feb 21 2005 - 06:52:04 CST