Re: regular expressions

From: Mark Leisher (mleisher@crl.nmsu.edu)
Date: Wed Jan 29 1997 - 16:53:58 EST


    Rick> A couple of questions for any Unix-heads out there...
    Rick> 1. Does anyone have a "freely distributable" regular expression
    Rick>    parser and/or text finding routine with these characteristics:
    Rick>    A. It eats Unicode text or UTF-8 on input
    Rick>    B. It uses the typical Unixoid regular expression syntax

    Rick> 2. Has anyone given any serious thought to extensions of said
    Rick> Unixoid regular expression syntax to handle non-English alphabets
    Rick> used as "ranges" for pattern matching?

Timing IS everything! I'm finally finishing off mine, and a version of it
(missing some of our more interesting features) will be made freely available
some time in the next couple of weeks (depending on my homework load). It is
a little different from the usual RE packages in that it compiles REs directly
to minimal DFAs.
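
For readers who have not seen an RE go straight to a DFA without an NFA in
between, here is a minimal sketch of one well-known way to do it, Brzozowski
derivatives. This is an illustration only and is not claimed to be the method
the package uses; the helper names (char, cat, alt, star, build_dfa, matches)
are hypothetical, and the result is a small DFA for simple patterns but is not
guaranteed minimal in general.

```python
# Sketch only: direct RE -> DFA construction via Brzozowski derivatives.
# Not the package's algorithm; all names here are made up for illustration.

EMPTY = ("empty",)  # matches nothing
EPS = ("eps",)      # matches only the empty string

def char(c):
    return ("chr", c)

def cat(a, b):
    if a == EMPTY or b == EMPTY:
        return EMPTY
    if a == EPS:
        return b
    if b == EPS:
        return a
    return ("cat", a, b)

def alt(a, b):
    # Flatten into a set so alternation is associative, commutative and
    # idempotent; this keeps the number of distinct derivatives finite.
    items = set()
    for r in (a, b):
        if r[0] == "alt":
            items.update(r[1])
        elif r != EMPTY:
            items.add(r)
    if not items:
        return EMPTY
    if len(items) == 1:
        return next(iter(items))
    return ("alt", frozenset(items))

def star(a):
    if a in (EMPTY, EPS):
        return EPS
    return a if a[0] == "star" else ("star", a)

def nullable(r):
    """True if r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return any(nullable(x) for x in r[1])  # alt

def deriv(r, c):
    """The RE matching whatever r can match after consuming character c."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "cat":
        d = cat(deriv(r[1], c), r[2])
        return alt(d, deriv(r[2], c)) if nullable(r[1]) else d
    if tag == "star":
        return cat(deriv(r[1], c), r)
    out = EMPTY  # alt: union of the derivatives of each branch
    for x in r[1]:
        out = alt(out, deriv(x, c))
    return out

def build_dfa(r, alphabet):
    """Each distinct derivative is one DFA state; state 0 is the start."""
    states, trans, accepting, work = {r: 0}, {}, set(), [r]
    while work:
        s = work.pop()
        if nullable(s):
            accepting.add(states[s])
        for c in alphabet:
            t = deriv(s, c)
            if t not in states:
                states[t] = len(states)
                work.append(t)
            trans[states[s], c] = states[t]
    return trans, accepting

def matches(dfa, text):
    trans, accepting = dfa
    state = 0
    for c in text:
        state = trans.get((state, c))
        if state is None:  # character outside the alphabet
            return False
    return state in accepting

# (α|β)*αβ over a two-letter non-English alphabet
pattern = cat(star(alt(char("α"), char("β"))), cat(char("α"), char("β")))
dfa = build_dfa(pattern, "αβ")
print(matches(dfa, "βααβ"))  # True
print(matches(dfa, "αβα"))   # False
```

Because the states are keyed on characters rather than bytes, the same table
works for any alphabet you hand it. A production matcher would want character
classes instead of an explicit alphabet, plus a separate minimization pass.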

The version to be distributed will be limited to handling UCS-2 strings, but
can be very easily adapted to handle UTF-8. It has most of the usual "Unixoid"
features except a few I haven't gotten around to yet (e.g. the interval
repetition operator {N,M}).
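
One way such an adaptation could look, as a rough sketch: decode the UTF-8
input into 16-bit code units up front and feed those to a matcher that only
understands UCS-2. The function name utf8_to_ucs2 is hypothetical and not part
of the package; the sketch assumes that anything outside the 16-bit range
(4-byte sequences) and any malformed input should simply be rejected.

```python
# Sketch only: a UTF-8 front end for a matcher that works on UCS-2 code units.
# utf8_to_ucs2 is an illustrative name, not an API of any real package.

def utf8_to_ucs2(data: bytes) -> list:
    """Decode UTF-8 bytes into a list of 16-bit code units (BMP only)."""
    units = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                 # 1 byte: 0xxxxxxx
            cp, extra = lead, 0
        elif 0xC2 <= lead <= 0xDF:      # 2 bytes: 110xxxxx 10xxxxxx
            cp, extra = lead & 0x1F, 1
        elif 0xE0 <= lead <= 0xEF:      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            cp, extra = lead & 0x0F, 2
        else:
            # Stray continuation byte, overlong 2-byte lead (C0/C1), or a
            # 4-byte sequence that cannot fit in UCS-2.
            raise ValueError(f"unsupported lead byte 0x{lead:02X} at offset {i}")
        tail = data[i + 1 : i + 1 + extra]
        if len(tail) != extra:
            raise ValueError(f"truncated sequence at offset {i}")
        for b in tail:
            if b & 0xC0 != 0x80:
                raise ValueError(f"bad continuation byte at offset {i}")
            cp = (cp << 6) | (b & 0x3F)
        if extra == 2 and cp < 0x800:
            raise ValueError(f"overlong encoding at offset {i}")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"surrogate code point at offset {i}")
        units.append(cp)
        i += 1 + extra
    return units

print([hex(u) for u in utf8_to_ucs2("añ€".encode("utf-8"))])
# ['0x61', '0xf1', '0x20ac']
```

Decoding up front keeps the matcher itself untouched. The alternative is to
decode lazily inside the matching loop, which avoids the intermediate buffer
at the cost of touching the matcher's inner loop.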
-----------------------------------------------------------------------------
mleisher@crl.nmsu.edu
Mark Leisher
Computing Research Lab
New Mexico State University
Box 30001, Dept. 3CRL
Las Cruces, NM 88003

"A designer knows he has achieved perfection
 not when there is nothing left to add, but
 when there is nothing left to take away."
                    -- Antoine de Saint-Exupéry


