From: Mark Davis (mark.davis@jtcsv.com)
Date: Sun Apr 03 2005 - 14:16:31 CST
Each character in the FOR REVIEW list is collected because either:
(a) it would not count as part of an XID, or
(b) it is part of a bicameral script and doesn't have an uppercase, which is
the situation for
026B ; LATIN ; Atomic-no-uppercase # L& (ɫ) LATIN SMALL LETTER L WITH MIDDLE TILDE
In either case there is prima facie reason for some level of scrutiny, if
the goal is to be initially conservative in repertoire. (In this, I am not
necessarily advocating one approach or another; I am simply trying to gather
information so that informed judgments can be made.)
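
The two criteria above can be approximated in a few lines of Python. This is only a rough sketch, not the procedure used to build the FOR REVIEW list: the function name review_reasons is hypothetical, Python's str.isidentifier() stands in for the XID test, and the "bicameral script" condition is reduced to "general category Ll with no uppercase mapping", since the standard unicodedata module exposes no script property. Results also depend on the Unicode version bundled with the interpreter, so characters that lacked an uppercase in 2005 may have one now.

```python
import unicodedata


def review_reasons(ch: str) -> list[str]:
    """Return which of the criteria (a)/(b) apply to a single character."""
    reasons = []

    # (a) Not usable inside an identifier. str.isidentifier() follows the
    # XID_Start/XID_Continue properties, so prefixing a known-good start
    # character ("a") effectively tests XID_Continue membership of ch.
    if not ("a" + ch).isidentifier():
        reasons.append("not-XID")

    # (b) Approximation (assumption): a lowercase letter whose uppercase
    # mapping is itself. The script is NOT checked here.
    if unicodedata.category(ch) == "Ll" and ch.upper() == ch:
        reasons.append("no-uppercase")

    return reasons


if __name__ == "__main__":
    # A few code points mentioned in this thread, plus U+0138 for comparison.
    for cp in (0x026B, 0x0264, 0x02C2, 0x0138):
        ch = chr(cp)
        name = unicodedata.name(ch, "<unnamed>")
        print(f"{cp:04X} {name}: {review_reasons(ch)}")
```

A character that yields an empty list would pass both tests under these assumptions; anything else would be a candidate for the kind of scrutiny described above.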
The WORD CHARACTERS ADDED list also needs review, especially for the
MODIFIER LETTERs, to see which (if any) of those are used in modern
languages. I am hoping to get information back like the following
(hypothetical):
---
Referring to the characters at the end of
http://unicode.org/reports/tr36/draft/idn-chars.txt

0264 ; Ll # (ɤ) LATIN SMALL LETTER RAMS HORN

needs to be included. It is lowercase in form only; it is used caselessly,
which explains the lack of uppercase. It is used for Wičita, a modern
language spoken in Kansazia, with several weekly newspapers (eg
http://qwery-news.kq)

018C ; LATIN ; Atomic # L& (ƌ) LATIN SMALL LETTER D WITH TOPBAR

is only used for Northeastern Squamish. There are no regular modern
publications using this character, outside of articles on linguistics.

026E ; Ll # (ɮ) LATIN SMALL LETTER LEZH
0270 ; Ll # (ɰ) LATIN SMALL LETTER TURNED M WITH LONG LEG
02C2 ; word-chars # Sk (˂) MODIFIER LETTER LEFT ARROWHEAD
02C3 ; word-chars # Sk (˃) MODIFIER LETTER RIGHT ARROWHEAD
02C4 ; word-chars # Sk (˄) MODIFIER LETTER UP ARROWHEAD
02C5 ; word-chars # Sk (˅) MODIFIER LETTER DOWN ARROWHEAD
02D2 ; word-chars # Sk (˒) MODIFIER LETTER CENTRED RIGHT HALF RING

are only used in the Danish Gua'uld system of phonetic transcription, not
for any modern language.

Mark

----- Original Message -----
From: "Peter Kirk" <peterkirk@qaya.org>
To: "Doug Ewell" <dewell@adelphia.net>
Cc: "Unicode Mailing List" <unicode@unicode.org>
Sent: Sunday, April 03, 2005 12:15
Subject: Re: Security Issues

> On 03/04/2005 17:27, Doug Ewell wrote:
>
> > ...
> >
> >There's also a significant controversy surrounding the ability of some
> >evil person to register "paypaɫ.com" or similar, using a letter like
> >U+026B that most people in the world aren't aware exists, ...
> >
>
> The standard should not pander to ignorance. Don't forget that there are
> billions of Chinese, Indians etc who are not familiar even with our
> basic ABC.
>
> >... and using it
> >to dupe innocent consumers. People are running around screaming that
> >internationalized domain names are evil for allowing these characters,
> >and that Unicode is evil for including them in the first place. This
> >"security" thread is an attempt to work out the best solution for all.
> >
>
> I see the point. But if we are going to allow U+0142 to support Polish,
> and so to allow anyone to register "paypał.com", then there is not much
> difference allowing them to use "paypaɫ.com", with U+026B. Perhaps
> U+0142 and U+026B can be listed as lookalikes. Actually, does anyone
> want U+026B? This is not a click. Perhaps you were thinking of U+01C2.
>
> --
> Peter Kirk
> peter@qaya.org (personal)
> peterkirk@qaya.org (work)
> http://www.qaya.org/
>
> --
> No virus found in this outgoing message.
> Checked by AVG Anti-Virus.
> Version: 7.0.308 / Virus Database: 266.9.1 - Release Date: 01/04/2005
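
The suggestion in the quoted message, that U+0142 and U+026B "can be listed as lookalikes", might be sketched as a small mapping table. Everything here is an assumption for illustration: the LOOKALIKES table, its two entries, and the function names skeleton and confusable are hypothetical, not taken from any published Unicode data file.

```python
# Hypothetical lookalike table (assumption): each listed character is mapped
# to the plain letter it could be mistaken for.
LOOKALIKES = {
    "\u0142": "l",  # LATIN SMALL LETTER L WITH STROKE
    "\u026b": "l",  # LATIN SMALL LETTER L WITH MIDDLE TILDE
}


def skeleton(label: str) -> str:
    """Replace every listed lookalike with its plain counterpart."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in label)


def confusable(a: str, b: str) -> bool:
    """True if two distinct labels collapse to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)


if __name__ == "__main__":
    print(confusable("paypal.com", "paypa\u026b.com"))
    print(confusable("paypa\u0142.com", "paypa\u026b.com"))
```

With a table like this, a registry could in principle refuse a new name whose skeleton matches an existing registration, without removing either character from the repertoire.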
This archive was generated by hypermail 2.1.5 : Sun Apr 03 2005 - 14:17:57 CST