I did not mention Arabic vowels and Shadda because I don't feel qualified
to.
Jony
> -----Original Message-----
> From: Paul Hoffman / IMC [mailto:phoffman@imc.org]
> Sent: Saturday, July 08, 2000 8:06 PM
> To: Jonathan Rosenne; idn@ops.ietf.org
> Subject: RE: [idn] Preparation of Internationalized Host Names - Hebrew
>
>
> At 12:43 PM +0300 7/8/00, Jonathan Rosenne wrote:
> > > Please note that not all punctuation is prohibited. The rules for the
> >> specific kinds of punctuation that are prohibited are in the document.
> > > U+05C0, which looks just like the ASCII "vertical bar", is probably
> >> acceptable (since vertical bar is acceptable). U+05C3 looks just like
> >> a colon and is therefore not acceptable; thanks for pointing this
> >> out. (And I have noted it to the Unicode folks for when they update
> >> the standard).
> >
> >Its meaning is punctuation, like comma or full stop, never mind its shape.
>
> Exactly my point. At present, we do *not* prohibit all punctuation.
> The only prohibited punctuation characters are those that are reserved
> or delimiters in URLs [RFC2396] and [RFC2732]. If this group decides
> to prohibit all punctuation, certainly we would then prohibit U+05C0.
> Or, we might prohibit all punctuation other than a certain small
> group of characters (which would be pretty difficult to choose
> correctly...). But, for now, we only prohibit a small set.
>
> > > >2. Cantillation Marks
> > > >0591 to 05af
> > > >
> >> >These should be either prohibited or ignored since they do not affect
> >> >pronunciation, similar to ignoring case differences.
> >> >
> >> >Personally, I would rather prohibit them since their presence is most
> >> >likely to be an error.
> >>
> >> If they never appear in personal names, company names, or spoken
> >> phrases, then they can safely be prohibited. Is that true for all of
> >> them?
> >
> >They never appear in common use, they are only used in biblical texts.
>
> Thanks, that's what I wanted to hear. I'll prohibit them in the
> next draft.
>
> > > >2. Points
> >> >05b0 to 05c4
> >> >
> >> >These should be either prohibited or ignored since they are optional.
> >> >In modern Hebrew they are seldom used, not all systems support them,
> >> >and it is valid to omit them.
> >> >
> >> >Personally, I would rather ignore them because a user may enter them,
> >> >and why not let him.
> >>
> >> This is much more problematic. We do not currently have any "ignored"
> >> characters. If I understand this correctly, the host name <HEBREW
> >> LETTER HE><HEBREW POINT SEGOL>.com looks and sounds different than
> >> <HEBREW LETTER HE><HEBREW POINT TSERE>.com, but could be considered
> >> the same for a host name. If so, I think we would have to prohibit
> >> them, not ignore them. Does that sound correct?
> >
> >They do sound different, but do not necessarily look different because
> >it is not mandatory to display points.
> >
> >Just like you ignore case in English, in Hebrew you should ignore points.
>
> From my (very limited) understanding of Hebrew, this makes sense.
> However, it means that we will have to make such other "ignoring"
> rules for a variety of scripts. I'm happy to do that if the group
> wants, but it certainly makes the name preparation harder. (Just to
> be clear: my personal preference would have been not to ignore case,
> but that decision was made *long* ago and cannot be reversed.) Doing
> so would require an extra step, probably between checking for
> prohibited characters and folding case, that says "look for any
> characters on this list and throw them away".
>
> How does the group feel about this? What other characters in scripts
> other than Hebrew would go here?
>
> --Paul Hoffman, Director
> --Internet Mail Consortium
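
The extra step Paul describes, sitting between the prohibited-character check and case folding, might look roughly like the sketch below. This is only an illustration under stated assumptions, not the draft's algorithm: the names prepare_label, PROHIBITED, and IGNORED are hypothetical; the code point ranges are the ones quoted in this thread; excluding U+05BE, U+05C0, and U+05C3 from the "points" range is an assumption made here because those three are punctuation rather than points; and Python's str.casefold() merely stands in for whatever case-folding step the draft specifies.

```python
# Hypothetical sketch of the three-step order discussed in the thread:
# 1. reject prohibited characters, 2. drop "ignored" characters, 3. fold case.

# Cantillation marks U+0591..U+05AF, proposed in the thread as prohibited.
CANTILLATION = set(range(0x0591, 0x05B0))

# Assumption: these code points inside U+05B0..U+05C4 are punctuation,
# not points, so they are not treated as ignorable here.
PUNCTUATION_IN_POINT_RANGE = {0x05BE, 0x05C0, 0x05C3}

# Points U+05B0..U+05C4 (minus the punctuation above), proposed as ignored.
IGNORED = set(range(0x05B0, 0x05C5)) - PUNCTUATION_IN_POINT_RANGE

# U+05C3 is described in the thread as not acceptable (it looks like a colon).
PROHIBITED = CANTILLATION | {0x05C3}


def prepare_label(label: str) -> str:
    """Prepare one host name label using the hypothetical rules above."""
    # Step 1: fail on any prohibited character.
    for ch in label:
        if ord(ch) in PROHIBITED:
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    # Step 2: throw away characters on the ignored list.
    kept = "".join(ch for ch in label if ord(ch) not in IGNORED)
    # Step 3: fold case (stand-in for the draft's case-folding step).
    return kept.casefold()


if __name__ == "__main__":
    he_segol = "\u05D4\u05B6"  # HEBREW LETTER HE + HEBREW POINT SEGOL
    he_tsere = "\u05D4\u05B5"  # HEBREW LETTER HE + HEBREW POINT TSERE
    # Both labels reduce to the bare letter, so they compare equal.
    print(prepare_label(he_segol) == prepare_label(he_tsere))
```

With rules like these, the two example labels from the thread (HE plus SEGOL and HE plus TSERE) would prepare to the same name, while a label containing a cantillation mark would be rejected outright.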