From: Thierry Moreau (thierry.moreau@connotech.com)
Date: Thu May 07 2009 - 15:08:22 CDT
This post is a general question about the design of validation logic for 
an identity management application.
This is somewhat related to IDN validity rules, but with slightly 
different application requirements.
UTR#36 (Unicode Security Considerations), Annex G (Language-Based 
Security), was published a few days after CLDR version 1.4 was released, 
in 2006-07. The text of this annex recommends the use of Unicode scripts 
as a basis for name validation rules, and recommends writing systems 
instead of languages as a refined strategy.
In the meantime, the CLDR project moved to version 1.6 (and 1.6.1) and 
improved "data on language and script usage" (presumably this covers 
exemplarCharacters).
The main question is whether the UTR#36 / Annex G advice *against* using 
CLDR data for validation rules (e.g. for security-aware applications 
where identity spoofing is a threat) has been revisited by anyone.
So far, my investigations along these lines indicate that it should be 
feasible to combine Unicode script information with CLDR 
exemplarCharacters data, after a lot of adjustments (e.g. to remove 
historic or phonetic scripts), to come up with language-specific rules 
for what is an acceptable identity in a given language. (The rules may 
actually apply to personal identification data elements such as place 
of birth.) Obviously, such validation applies to normalized strings.
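
To make the idea concrete, here is a rough Python sketch of the exemplar-set 
side of such a rule. It is only an illustration under stated assumptions: 
the EXEMPLARS table is hand-typed and hypothetical (not actual CLDR data), 
NAME_PUNCTUATION and is_acceptable() are names made up for this example, 
and the Unicode script check is left out entirely.

```python
import unicodedata

# Hypothetical exemplar sets, hand-typed for illustration only. A real
# application would load the CLDR exemplarCharacters data and then prune it.
EXEMPLARS = {
    "fr": set(unicodedata.normalize(
        "NFC", "abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ")),
    "el": set(unicodedata.normalize(
        "NFC", "αβγδεζηθικλμνξοπρσςτυφχψωάέήίόύώϊϋΐΰ")),
}

# Assumption: extra characters tolerated in names of people and places.
NAME_PUNCTUATION = set(" -'")


def is_acceptable(name: str, language: str) -> bool:
    """Return True if every character of the NFC-normalized name is in the
    (lowercase) exemplar set for the language, or is tolerated punctuation."""
    normalized = unicodedata.normalize("NFC", name)
    allowed = EXEMPLARS[language] | NAME_PUNCTUATION
    # Empty strings are rejected; each character is compared in lowercase.
    return bool(normalized) and all(ch.lower() in allowed for ch in normalized)
```

With this sketch, a name typed with a combining accent passes after 
normalization, while the same name with a Cyrillic look-alike letter 
substituted for a Latin one is rejected.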
Any comments or suggestions?
Thanks in advance.
-- 
- Thierry Moreau
CONNOTECH Experts-conseils inc.
9130 Place de Montgolfier
Montreal, Qc
Canada H2M 2A1
Tel.: (514)385-5691
Fax: (514)385-5900
web site: http://www.connotech.com
e-mail: thierry.moreau@connotech.com