Jonathan> I believe that a Unicode regular expression should have an
Jonathan> option to recognize letters and ignore diacritics. For example,
Jonathan> to look for vis-a-vis whether the a is accented or not, or to
Jonathan> recognize a Hebrew word irrespective of the way it is pointed.
We had to have this feature for our Arabic morphological analyzer, so it is
already available.
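
As a rough illustration of the idea only (this is not the analyzer's
implementation, which is not shown in this message), one common way to
get diacritic-insensitive matching is to decompose both the pattern and
the text to NFD and drop the combining marks before matching. The helper
names strip_marks and search_ignoring_marks below are hypothetical, and
the sketch assumes Python's standard re and unicodedata modules:

```python
import re
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop every combining mark (categories Mn, Mc, Me)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


def search_ignoring_marks(pattern: str, text: str):
    """Hypothetical helper: search after removing marks from both sides.

    Assumption: match offsets refer to the stripped text, not the original.
    """
    return re.search(strip_marks(pattern), strip_marks(text))


# Accented and unaccented spellings match each other.
print(search_ignoring_marks("vis-a-vis", "a vis-\u00e0-vis b") is not None)

# A pointed Hebrew word reduces to its bare consonantal spelling.
pointed = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"
print(strip_marks(pointed) == "\u05e9\u05dc\u05d5\u05dd")
```

Stripping marks up front keeps the regular expression engine unchanged,
at the cost of losing the original offsets. An engine-level option, as
Jonathan suggests, would avoid that.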
-----------------------------------------------------------------------------
mleisher@crl.nmsu.edu
Mark Leisher
Computing Research Lab
New Mexico State University
Box 30001, Dept. 3CRL
Las Cruces, NM 88003

"A designer knows he has achieved perfection
not when there is nothing left to add, but
when there is nothing left to take away."
    -- Antoine de Saint-Exupéry