Re: [idn] nameprep forbidden characters

From: Mark H. David (mhd@world.std.com)
Date: Sun Sep 17 2000 - 20:28:13 EDT


Others have spoken about Hebrew characters as used for the Hebrew language,
so I'll say something about the other major consumer of the Hebrew script, Yiddish.

For Yiddish, I also feel that the points should be ignored in domain names.
It may be even more important than for Hebrew. Yiddish is usually published
with some points. But there is considerable variety, historically, between
publishers, and then between individual users, as to how they write the words.
Especially in handwriting, people get lazy and leave off the points. Then there's
the problem that the software tools don't make it easy, if they make it possible
at all, to write using points, or using all points that one would wish, or all
points required by the prevailing orthography standard, or the standard of this
or that publishing house. Then there's the problem that people don't know how
to use the points to spell properly, even if they wish to do so. (This is
MUCH less of a problem than for Hebrew -- Yiddish is written with a very
restricted number of combinations, 12 in standard use -- a much simpler system,
but still a lot of people don't know the system.)

If points are ignored, it will allow domain names to be printed and
registered and used with the standard letter/point combinations,
without requiring the domain name to be input with the matching
points or any points at all. This will cause much less confusion and far fewer problems.

Regarding Mark's question of why someone would take the trouble to type in the
points with Hebrew letters (when registering): it may be in order to spell the
word exactly right, presumably according to Yiddish standard orthography,
putting it into "canonical form", so to speak. It also may clarify the
pronunciation or disambiguate between words spelled the same except for the
points, which is also a good reason for using them for Hebrew. However, it
would still be a disappointment to most speakers if the domain with the same
Hebrew letters, unpointed, did not match such a pointed domain name.

It's also crucial that if someone tries to get DNS to resolve
a Yiddish name, pointed in the standard way as it might well be, it matches
the unpointed domain name. Considering points for matching purposes would
make it harder to use points at all, and would therefore discourage their use
further. That would be worse for Yiddish, where pointing is more the rule
than the exception, than for Hebrew.
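
As a rough illustration of the matching rule argued for above, here is a
minimal Python sketch. It is not what nameprep specifies. The function names
are hypothetical, and it assumes that "points" means Hebrew-script nonspacing
marks (Unicode general category Mn), which also covers the cantillation
accents. Names are decomposed first so that precomposed letter/point
combinations are treated the same as letter-plus-point sequences.

```python
import unicodedata


def strip_hebrew_points(label: str) -> str:
    """Return label with Hebrew nonspacing marks removed (hypothetical helper)."""
    # NFD splits precomposed forms such as U+FB2E (alef with patah)
    # into the base letter followed by the combining point.
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(
        ch
        for ch in decomposed
        # Assumption: a "point" is any Hebrew-named character of category Mn.
        if not (
            unicodedata.category(ch) == "Mn"
            and unicodedata.name(ch, "").startswith("HEBREW")
        )
    )


def labels_match(a: str, b: str) -> bool:
    """Compare two labels while ignoring Hebrew points."""
    return strip_hebrew_points(a) == strip_hebrew_points(b)


# The word "yidish" with a hiriq under the second yod, and without it.
pointed = "\u05D9\u05D9\u05B4\u05D3\u05D9\u05E9"
unpointed = "\u05D9\u05D9\u05D3\u05D9\u05E9"
print(labels_match(pointed, unpointed))
```

Under this sketch a name registered with points and the same name typed
without them compare as equal, while the letters themselves still have to
agree.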

I think this approach will work well for all of the Jewish languages using the
Hebrew script (Hebrew, Yiddish, Ladino, and others). I'm not an expert in all
these languages, but I think I know enough about their orthography and
use of the Hebrew script to make this generalization.

It's not a perfect normalization scheme, but it's about as imperfect, useful,
and nonharmful, as the casefolding approach DNS uses, as applied
to English and other Latin-script languages.

Mark David
Moderator, UYIP, http://www.uyip.org/unicode/

----- Original Message -----
From: "Mark Davis" <markdavis@ispchannel.com>
To: "Unicode List" <unicode@unicode.org>
Cc: <wael.nasr@i-dns.net>; "Edmon" <edmon@neteka.com>
Sent: Sunday, September 17, 2000 12:58 PM
Subject: Re: [idn] nameprep forbidden characters

> I am curious why you feel so strongly that the Hebrew points should be ignored
> in domain names. Prima facie, it seems that there is little harm in treating
> them no differently from other characters. What problem would arise if the
> domain was ABC.COM and I could not get it by typing AB*C.COM? (Here uppercase
> stands for Hebrew, and * for a point.) Conversely, if someone really did
> register AB*C.COM, would it be a problem that I couldn't get to that location by
> typing ABC.COM?
>
> It is my understanding that the vowels are rarely used, and that people really
> wouldn't use them in registered domain names anyway. It seems that if someone
> did take the trouble to type in the points, that there would be a reason for
> their making such a distinction.
>
> I'd appreciate it if you could help me to understand the issue more clearly.
>
> Mark
>
> Jonathan Rosenne wrote:
>
> > We should distinguish "punctuation", like 060C Arabic Comma, and
> > "diacritics", such as 064E Arabic Fatha. Diacritics is probably the wrong
> > word. I have the impression that you were referring to the latter.
> >
> > For Hebrew, my opinion is that from the point of view of the user,
> > punctuation should be forbidden, while diacritics such as the vowels and
> > other combining characters should be allowed and be ignored.
> >
> > I believe it is important that the rules for Arabic and Hebrew should be the
> > same as far as possible.
> >
> > Jony
> >
> > > -----Original Message-----
> > > From: owner-idn@ops.ietf.org [mailto:owner-idn@ops.ietf.org]On
> > > Behalf Of Wael Nasr
> > > Sent: Saturday, September 16, 2000 1:16 AM
> > > To: Edmon; idn working group; Adam M. Costello
> > > Subject: RE: [idn] nameprep forbidden characters
> > >
> > >
> > > Wanted to share with you that in the Arabic Working group of MINC we have
> > > discussed this point at length.
> > > In Arabic the meaning of the word will change depending on punctuation,
> > > like the words "knowledge" and "flag" in Arabic are exactly the same except
> > > for punctuation.
> > >
> > > It is my opinion that, at least regarding Arabic, no punctuation should be
> > > allowed for now.
> > >
> > > I am sure 5 years from now, domain name systems will be much more dynamic
> > > than what we have now and will not be simply a simple mapping of Unicode or
> > > ASCII to an IP number.
> > > At that time, punctuation can be allowed to be part of the game.
> > > wael
> > >
> > > -------------------------------------------
> > > Wael Nasr
> > > Director, Middle East Business Development
> > > I-DNS.net
> > > wael.nasr@i-dns.net
> > > Cell Phone(Egypt):+(201) 222 55 380
> > >
> > > -----Original Message-----
> > > From: owner-idn@ops.ietf.org [mailto:owner-idn@ops.ietf.org]On Behalf Of
> > > Edmon
> > > Sent: Saturday, September 16, 2000 12:59 AM
> > > To: idn working group; Adam M. Costello
> > > Subject: Re: [idn] nameprep forbidden characters
> > >
> > >
> > > > Perhaps host names
> > > > should avoid all punctuation in all languages so people don't have to
> > > > worry about it.
> > >
> > > I think we have to remember that it is the registrant's choice to choose a
> > > name that best reflects their identity online. Punctuation marks may serve
> > > as great symbols that identify an entity; for example, a person called
> > > O'Brian would want to have the apostrophe in his domain name and a company
> > > A&B would want the "&" in its name. Our move to multilingual is the best
> > > opportunity for us to re-include these worthwhile and long-awaited symbols
> > > in the domain name space.
> > >
> > > Edmon
> > >
> > > >
> > > > AMC
> > >
> > >
> > >
>


