From: JR (jr@qsm.co.il)
Date: Fri Nov 18 2005 - 09:30:17 CST
> -----Original Message-----
> From: unicode-bounce@unicode.org
> [mailto:unicode-bounce@unicode.org] On Behalf Of Neil Harris
> Sent: Friday, November 18, 2005 4:14 PM
> To: Mark Davis
> Cc: Michael Everson; Unicode Discussion
> Subject: Re: Hebrew script in IDN (was Exemplar Characters)
>
>
> Mark Davis wrote:
> > It is not that clear-cut. Identifiers by their nature cannot include
> > all words and phrases valid in all languages. For IDN, for example,
> > one can't express the perfectly reasonable English word "can't", or a
> > word like "I.B.M.".
> >
> > I did introduce a proposal in March for considering the status of some
> > word characters, which turned into a discussion in the UTC of
> > whether to add certain items to the identifier definition.
> >
> > http://www.unicode.org/L2/L2005/05083-wordprops.txt
> >
> > I'll copy that section here for those without access:
> >
> > 0027 ; # Po APOSTROPHE
> > 002D ; # Pd HYPHEN-MINUS
> > 002E ; # Po FULL STOP
> > 003A ; # Po COLON
> > 00B7 ; # Po MIDDLE DOT
> > 058A ; # Pd ARMENIAN HYPHEN
> > 05F3 ; # Po HEBREW PUNCTUATION GERESH
> > 05F4 ; # Po HEBREW PUNCTUATION GERSHAYIM
> > 200C ; # Cf ZERO WIDTH NON-JOINER // for Indic?
> > 200D ; # Cf ZERO WIDTH JOINER // for Indic?
> > 2010 ; # Pd HYPHEN
> > 2019 ; # Pf RIGHT SINGLE QUOTATION MARK
> > 2027 ; # Po HYPHENATION POINT
> > 30A0 ; # Pd KATAKANA-HIRAGANA DOUBLE HYPHEN
> >
> >
> > The UTC decided against adding them to the identifier definition.
> > If we were to change that for the Hebrew punctuation, we would have to
> > see a documented case for it.
> >
> > Mark
> >
>
> Mark,
>
> I think you might meet some opposition to including the
> following in IDNs:
>
> APOSTROPHE (?protocol character)
> FULL STOP (it's a label separator: so no chance for use in IDN labels)
> COLON (a definite protocol character in URLs)
> ZWNJ and ZWJ (unless Indic experts can make a _very_ good case for these
> being used only in contexts where they cause _visible_ and _unambiguous_
> rendering changes)
> RIGHT SINGLE QUOTATION MARK (spoof of APOSTROPHE)
> HYPHENATION POINT (spoof of MIDDLE DOT)
> KATAKANA-HIRAGANA DOUBLE HYPHEN (spoof of EQUALS SIGN, ?protocol character)
>
> which leaves only
>
> 00B7 ; # Po MIDDLE DOT
> 058A ; # Pd ARMENIAN HYPHEN
> 05F3 ; # Po HEBREW PUNCTUATION GERESH
> 05F4 ; # Po HEBREW PUNCTUATION GERSHAYIM
>
> as characters which I would consider possible uncontroversial candidates
> for IDN.
Certainly Geresh and Gershayim are not uncontroversial.
Jony
>
> -- Neil
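
For anyone who wants to check the quoted list mechanically, here is a minimal
sketch in Python. It looks up each proposed code point's General Category and
name with the standard unicodedata module, then sorts the code points by how
Neil's reply treats them. The names PROPOSED, OBJECTED, LEFT_OVER, describe and
unaddressed are made up for this illustration, and the short objection labels
are paraphrases of Neil's wording, not anything normative. The categories come
from whatever Unicode version your Python ships with, which may be newer than
the 2005 data under discussion.

```python
import unicodedata

# Code points copied from the quoted proposal, in the order listed there.
PROPOSED = [0x0027, 0x002D, 0x002E, 0x003A, 0x00B7, 0x058A, 0x05F3,
            0x05F4, 0x200C, 0x200D, 0x2010, 0x2019, 0x2027, 0x30A0]

# Objections raised in Neil's reply (labels are paraphrased shorthand).
OBJECTED = {
    0x0027: "possible protocol character",
    0x002E: "label separator",
    0x003A: "protocol character in URLs",
    0x200C: "needs a visible, unambiguous rendering effect",
    0x200D: "needs a visible, unambiguous rendering effect",
    0x2019: "spoof of APOSTROPHE",
    0x2027: "spoof of MIDDLE DOT",
    0x30A0: "spoof of EQUALS SIGN",
}

# The four code points Neil lists as what is left.
LEFT_OVER = {0x00B7, 0x058A, 0x05F3, 0x05F4}


def describe(cp):
    """Format one code point in the same style as the quoted list."""
    ch = chr(cp)
    return f"{cp:04X} ; # {unicodedata.category(ch)} {unicodedata.name(ch)}"


def unaddressed(proposed, objected, left_over):
    """Code points that appear in neither of Neil's two lists."""
    return [cp for cp in proposed
            if cp not in objected and cp not in left_over]


if __name__ == "__main__":
    for cp in PROPOSED:
        if cp in OBJECTED:
            status = "objection: " + OBJECTED[cp]
        elif cp in LEFT_OVER:
            status = "left over in Neil's list"
        else:
            status = "not addressed in Neil's reply"
        print(describe(cp), "->", status)
```

Run as a script, it prints one line per code point. Note that HYPHEN-MINUS and
HYPHEN fall into neither of Neil's lists; the message does not say why, so the
sketch only reports them as unaddressed rather than guessing.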