RE: [idn] nameprep forbidden characters

From: Jonathan Rosenne (rosenne@qsm.co.il)
Date: Wed Sep 27 2000 - 12:50:40 EDT


See my comments inline.

Jony

> -----Original Message-----
> From: Mark Davis [mailto:markdavis@ispchannel.com]
> Sent: Sunday, September 17, 2000 10:40 PM
> To: Jonathan Rosenne
> Cc: Unicode List; wael.nasr@i-dns.net; Edmon; bidi@unicode.org
> Subject: Re: [idn] nameprep forbidden characters
>
>
> I'm not trying to argue with you on this issue -- it may very
> well be best for points to be ignored. But I do want to
> understand the situation a bit better. My questions below should
> not be taken as rhetorical criticism, but simply as questions for
> clarification.
>
> For others, I am also interested in the situation vis-a-vis
> Arabic, whether we should treat it the same as Hebrew in terms of
> the vowel marks (fatha, etc.).
>
> Mark
>
> Jonathan Rosenne wrote:
>
> > Why should case be ignored in English?
>
> Except for an extremely small set of edge cases (such as Polish
> vs polish, God vs god), there is no extra meaning attached to case.

In the context of identifiers such as domain names, I believe the
justification for ignoring case in English is related to convenience and
user friendliness.

Unless it is a leftover from the 6-bit days.

>
> > In Hebrew, points are optional. The word is the same with them
> > and without them, or with just some of them.
>
> I had thought that there were many words with the same base
> letters, but different pronunciations (and meaning), and that
> different vowels would be used for the different pronunciations.
> That's the way for Arabic, and I had assumed it was the same for
> Hebrew. Is that not the case? From the base
> letters in each word are the vowels always predictable, so that
> they are completely optional?

There are homonyms in Hebrew, just as there are in most languages. Some can
be resolved with points, some cannot. Some platforms support points, some do
not, and some do but only with some inconvenience. Newspapers can use points,
but do so sparingly, mainly to disambiguate homonyms, say about once per sheet.

>
> > In addition, not all systems support them, and when they do
> > most users don't know how to type them. It isn't easy - see
> > http://www.qsm.co.il/NewHebrew/wniqud.htm
> >
> > A domain owner could publish it with points, to clarify the
> > pronunciation, but many users would type it without them or even
> > get them wrong.
>
> Do you think that it is a realistic case, that a domain owner
> would need to use points in that manner, and that a significant
> fraction of domain owners would do this?

Not a large number.

>
> > The issue has been discussed at the Hebrew WG of the SII and I
> > think there is general agreement on this issue. We plan a paper
> > some time in the future.
> >
> > I feel that when identifiers are case sensitive, such as in C,
> > there may be a case for respecting points, although this would
> > cause a problem with cross-system portability, but where case is
> > ignored, such as in domain names, the emphasis is more on the
> > pronunciation rather than the exact spelling.
>
> I didn't quite get the last sentence. I had thought that the
> vowel marks were used to get the exact pronunciation. If that is
> not true, it may be part of my misunderstanding of the situation.

Points carry more than pronunciation: in modern Hebrew we do not
distinguish between long and short vowels, and we do not pronounce the Dagesh
except in three letters.

In summary, we have two alternatives: to disallow points, or to allow them
and ignore them. I think the latter is more friendly.
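
To make the two alternatives concrete, here is a minimal sketch in Python.
It is an illustration only and not text from any nameprep draft. The function
names (is_hebrew_point, ignore_points, disallow_points) are hypothetical, and
it assumes that "points" means the nonspacing marks in the range U+0591 to
U+05C7 (vowel points, Dagesh, cantillation marks). Hebrew punctuation in the
same range, such as Maqaf (U+05BE), is not a nonspacing mark and is left alone.

```python
import unicodedata


def is_hebrew_point(ch):
    # Assumption: a "point" is a nonspacing mark (category Mn) that lies
    # in U+0591..U+05C7. Punctuation in that range has another category.
    return "\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn"


def ignore_points(label):
    # Alternative 2: allow points on input, but drop them before comparing.
    return "".join(ch for ch in label if not is_hebrew_point(ch))


def disallow_points(label):
    # Alternative 1: reject any label that contains a point.
    if any(is_hebrew_point(ch) for ch in label):
        raise ValueError("label contains Hebrew points")
    return label


# "shalom" typed with points and without them
pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"
plain = "\u05E9\u05DC\u05D5\u05DD"

# With the "ignore" rule both spellings reach the same name.
print(ignore_points(pointed) == ignore_points(plain))
```

With the "disallow" rule, the pointed spelling above would simply be refused,
which is why I consider the "ignore" rule the friendlier of the two.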

>
> > Jony
> >
> > > -----Original Message-----
> > > From: Mark Davis [mailto:markdavis@ispchannel.com]
> > > Sent: Sunday, September 17, 2000 7:58 PM
> > > To: Unicode List
> > > Cc: wael.nasr@i-dns.net; Edmon
> > > Subject: Re: [idn] nameprep forbidden characters
> > >
> > >
> > > I am curious why you feel so strongly that the Hebrew points
> > > should be ignored in domain names. Prima facie, it seems that
> > > there is little harm in treating them no differently from other
> > > characters. What problem would arise if the domain was ABC.COM
> > > and I could not get it by typing AB*C.COM? (Here uppercase
> > > stands for Hebrew, and * for a point.) Conversely, if someone
> > > really did register AB*C.COM, would it be a problem that I
> > > couldn't get to that location by typing ABC.COM?
> > >
> > > It is my understanding that the vowels are rarely used, and that
> > > people really wouldn't use them in registered domain names
> > > anyway. It seems that if someone did take the trouble to type in
> > > the points, that there would be a reason for their making such a
> > > distinction.
> > >
> > > I'd appreciate it if you could help me to understand the issue
> > > more clearly.
> > >
> > > Mark
> > >
> > > Jonathan Rosenne wrote:
> > >
> > > > We should distinguish "punctuation", like 060C Arabic Comma, and
> > > > "diacritics", such as 064E Arabic Fatha. Diacritics is probably
> > > > the wrong word. I have the impression that you were referring to
> > > > the latter.
> > > >
> > > > For Hebrew, my opinion is that from the point of view of the user,
> > > > punctuation should be forbidden, while diacritics such as the
> > > > vowels and other combining characters should be allowed and be
> > > > ignored.
> > > >
> > > > I believe it is important that the rules for Arabic and Hebrew
> > > > should be the same as far as possible.
> > > >
> > > > Jony
> > > >
> > > > > -----Original Message-----
> > > > > From: owner-idn@ops.ietf.org [mailto:owner-idn@ops.ietf.org]On
> > > > > Behalf Of Wael Nasr
> > > > > Sent: Saturday, September 16, 2000 1:16 AM
> > > > > To: Edmon; idn working group; Adam M. Costello
> > > > > Subject: RE: [idn] nameprep forbidden characters
> > > > >
> > > > >
> > > > > Wanted to share with you that in the Arabic working group of
> > > > > MINC we have discussed this point at length.
> > > > > In Arabic the meaning of the word will change depending on
> > > > > punctuation, like the words "knowledge" and "flag", which in
> > > > > Arabic are exactly the same except for punctuation.
> > > > >
> > > > > It is my opinion that, at least regarding Arabic, no punctuation
> > > > > should be allowed for now.
> > > > >
> > > > > I am sure 5 years from now, domain name systems will be much
> > > > > more dynamic than what we have now and will not be simply a
> > > > > mapping of Unicode or ASCII to an IP number.
> > > > > At that time, punctuation can be allowed to be part of the game.
> > > > > wael
> > > > >
> > > > > -------------------------------------------
> > > > > Wael Nasr
> > > > > Director, Middle East Business Development
> > > > > I-DNS.net
> > > > > wael.nasr@i-dns.net
> > > > > Cell Phone(Egypt):+(201) 222 55 380
> > > > >
> > > > > -----Original Message-----
> > > > > > From: owner-idn@ops.ietf.org [mailto:owner-idn@ops.ietf.org]On
> > > > > > Behalf Of Edmon
> > > > > Sent: Saturday, September 16, 2000 12:59 AM
> > > > > To: idn working group; Adam M. Costello
> > > > > Subject: Re: [idn] nameprep forbidden characters
> > > > >
> > > > >
> > > > > > > Perhaps host names
> > > > > > > should avoid all punctuation in all languages so people
> > > > > > > don't have to worry about it.
> > > > >
> > > > > > I think we have to remember that it is the registrant's
> > > > > > choice to choose a name that best reflects their identity
> > > > > > online. Punctuation marks may serve as great symbols that
> > > > > > identify an entity, for example a person called O'Brian
> > > > > > would want to have the apostrophe for his domain name and a
> > > > > > company A&B would want the "&" in their name. Our move to
> > > > > > multilingual is the best opportunity for us to re-include
> > > > > > these worthwhile and long awaited symbols back into the
> > > > > > domain name space.
> > > > >
> > > > > Edmon
> > > > >
> > > > > >
> > > > > > AMC
> > > > >
> > > > >
> > > > >
> > >
>