Re[2]: IDN problem.... :(

From: Alexander Savenkov (savenkov@xmlhack.ru)
Date: Fri Feb 18 2005 - 09:14:56 CST


    Hello,

    on 2005-02-18T13:35:46+03:00 Peter Kirk <peterkirk@qaya.org> wrote:

    > On 18/02/2005 06:00, Francois Yergeau wrote:

    >> Alexander Savenkov wrote:
    >>
    >>> Suppose I *want* to visit XML-документы.com (which is not unlikely).
    >>> My browser should *not* alert me. Never.
    >>>
    >>> The only solution to this seems to be for the registrar to check each
    >>> new domain name by hand (not necessarily with mixed scripts).
    >>
    >> I'm afraid that checking every registration by hand would be both too
    >> error-prone and too work-intensive. You'll probably have to put up
    >> with your browser alerting you. But perhaps good browsers will let
    >> you build up a white list, so that you need to suffer the alert only
    >> once?
    >>
    > The problem with this is that Alexander's example is neither unique nor
    > improbable; indeed, I would expect thousands of such IDNs to be
    > registered if they are allowed. In Cyrillic script, and I think in many
    > other non-Latin scripts, it is common practice to insert Latin-script
    > technical terms, acronyms, etc., especially for items relating to
    > computers and other modern technology.

    Indeed, the example is very probable. Many terms and abbreviations
    are never translated, in modern technology and in other fields too.

    > Indeed this kind of usage has a
    > long history, see
    > http://ptolemy.tlg.uci.edu/~opoudjis/unicode/unicode_mixing.html section
    > 2. So there is a real need to allow some kinds of mixed script IDNs for
    > such circumstances.

    > Perhaps one way for this kind of mixed script name to be distinguished
    > from spoofing is to require a hyphen at the boundary between scripts,
    > as in Alexander's example.
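
To make the proposal above concrete, here is a minimal sketch of a
"hyphen at every script boundary" check. It is only an illustration, not
anything a registrar or browser is known to implement. The function names
are hypothetical, and the script of a character is guessed from the first
word of its Unicode name, which is a rough approximation because the Python
standard library does not expose the Script property.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script guess from the Unicode character name (an assumption,
    not the real Unicode Script property)."""
    if not ch.isalpha():
        return "Common"  # digits and other non-letters
    name = unicodedata.name(ch, "")
    return name.split(" ", 1)[0] if name else "Unknown"


def hyphen_separates_scripts(label: str) -> bool:
    """Return True if every change of script inside the label
    happens across a hyphen (hypothetical helper)."""
    prev = None
    for ch in label:
        if ch == "-":
            prev = None  # a hyphen ends the current script run
            continue
        cur = script_of(ch)
        if cur == "Common":
            continue  # digits neither start nor end a script run
        if prev is not None and cur != prev:
            return False  # script changed without a hyphen in between
        prev = cur
    return True


if __name__ == "__main__":
    for label in ["XML-документы", "XMLдокументы", "p\u0430ypal"]:
        print(label, hyphen_separates_scripts(label))
```

Under this rule XML-документы is accepted, while the same name without the
hyphen, or a Latin name with one Cyrillic letter slipped in, is rejected.
A name written entirely in look-alike letters of a single script passes
unnoticed, so the check covers mixed-script names only.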

    The sole reason I put a hyphen there was to obey the rules of Russian
    spelling. I cannot come up with a no-hyphen example right now, but I'm
    sure there are languages that write such mixed-script words without one.

    All I can say is that a hyphen, or any other visual delimiter that
    breaks the natural spelling of the language, is unacceptable for the
    reasons I mentioned before.

    Alexander

    -- 
      Alexander Savenkov                            http://www.xmlhack.ru/
      savenkov@xmlhack.ru             http://www.xmlhack.ru/authors/croll/
    

