CLDR ExemplarCharacters Data for Identity Management Data Validation Rules

From: Thierry Moreau (thierry.moreau@connotech.com)
Date: Thu May 07 2009 - 15:08:22 CDT

Next message: Doug Ewell: "Re: Rendering of Candrabindhu & Visarga Dual Combination in Indic Scripts"

Previous message: Vinodh Rajan: "Re: Rendering of Candrabindhu & Visarga Dual Combination in Indic Scripts"
Next in thread: Thierry Moreau: "Re: CLDR ExemplarCharacters Data for Identity Management Data Validation Rules"
Reply: Thierry Moreau: "Re: CLDR ExemplarCharacters Data for Identity Management Data Validation Rules"

This post is a general question about the design of validation logic for
an identity management application.

This is somewhat related to IDN validity rules, but with slightly
different application requirements.

UTR#36 (Unicode Security Considerations) Annex G (Language-Based
Security) was published a few days after CLDR version 1.4 was released,
in 2006-07. The text of this annex recommends the use of Unicode scripts
as a basis for name validation rules, and recommends writing systems
instead of languages as a refined strategy.

In the meantime, the CLDR project moved to version 1.6 (and 1.6.1) and
improved "data on language and script usage" (presumably this covers
exemplarCharacters).

The main question is whether the UTR#36 / Annex G advice *against* using
CLDR data for validation rules (e.g. for security-aware applications
where identity spoofing is a threat) has been revisited by anyone.

So far, my investigations along these lines indicate that it should be
feasible to combine Unicode script information and CLDR
exemplarCharacters data with a lot of adjustments (e.g. to remove
historic or phonetic scripts) to come up with language-specific rules
for what is an acceptable identity in a given language (actually the
rules may apply to personal identification data elements such as place
of birth). Obviously, such validation applies to normalized strings.
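
A minimal sketch of the kind of check described above, in Python, may
help make the question concrete. Everything in it is an assumption for
illustration: the French character set is hand-copied rather than loaded
from CLDR, the extra punctuation set and the function name
is_acceptable_name are hypothetical, and the Unicode script filtering
step is left out because the Python standard library does not expose the
script property.

```python
import unicodedata

# Assumption: a hand-copied, illustrative set of French exemplar letters.
# A real application would load exemplarCharacters from the CLDR locale data.
FRENCH_EXEMPLARS = set("abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ")

# Assumption: punctuation tolerated inside names (space, hyphen, apostrophes).
# This set is an application choice, not CLDR data.
NAME_PUNCTUATION = set(" -'\u2019")


def is_acceptable_name(value, exemplars=FRENCH_EXEMPLARS, extra=NAME_PUNCTUATION):
    """Return True if every character of the normalized value is allowed.

    The value is normalized to NFC first, so composed and decomposed
    spellings of the same name are treated alike, then lowercased because
    the exemplar set above lists lowercase letters only.
    """
    normalized = unicodedata.normalize("NFC", value)
    if not normalized:
        return False
    return all(ch in exemplars or ch in extra for ch in normalized.lower())
```

With this sketch, "Montréal" is accepted in both its composed and
decomposed forms, while a look-alike spelling that substitutes a Cyrillic
letter for the Latin "o" is rejected, since that letter is not in the
exemplar set.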

Any comment or suggestion?

Thanks in advance.

-- 
- Thierry Moreau
CONNOTECH Experts-conseils inc.
9130 Place de Montgolfier
Montreal, Qc
Canada   H2M 2A1
Tel.: (514)385-5691
Fax:  (514)385-5900
web site: http://www.connotech.com
e-mail: thierry.moreau@connotech.com


This archive was generated by hypermail 2.1.5 : Thu May 07 2009 - 15:17:42 CDT