From: Michael Maxwell (mmaxwell@casl.umd.edu)
Date: Tue Oct 02 2007 - 11:59:46 CST
I hesitate to jump into this thread, but:
Asmus Freytag wrote:
> Depending on how many accented letters a language uses,
> writing the equivalent expression manually can be both
> tedious and error-prone.
Aren't there two issues here that need to be separated:
(1) the issue of what some regex *means*, e.g. what ^X means, where X is some regex.
(2) the question of how easy it is to enter X on a computer.
It seems to me, at least, that there are lots of ways of doing (2), including keying stuff in at the command line, using a GUI like Bill Poser's, and/or having pre-compiled regexes. Pre-compiled regexes might be user-defined (as with certain FST tools, like xfst or sfst), or they might come pre-defined with a regex-using program (as '[:space:]' does for POSIX regexes), or they might be pre-compiled for different locales or for Unicode blocks. There might even be a future regex program that could pull the meaning of some constant regex off a website, as we do for XML schemas now.
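To make the separation concrete, here is a rough Python sketch of a pre-defined class that is built once and then reused by name, so nobody has to type the accented letters by hand. The helper accented_variants, the names E_ANY and NOT_E, and the 0x250 cutoff (roughly Latin-1 Supplement through Latin Extended-B) are all made up for illustration; this is not how any particular regex engine defines its classes.

```python
import re
import unicodedata


def accented_variants(base: str, limit: int = 0x250) -> str:
    """Collect `base` plus every precomposed character below `limit`
    whose canonical decomposition is `base` followed only by combining marks.

    Hypothetical helper: the name and the `limit` cutoff are assumptions
    made for this sketch.
    """
    chars = []
    for cp in range(limit):
        ch = chr(cp)
        decomposed = unicodedata.normalize("NFD", ch)
        only_marks = all(unicodedata.combining(c) for c in decomposed[1:])
        if decomposed[0] == base and only_marks:
            chars.append(re.escape(ch))
    return "".join(chars)


# "Pre-compiled" pieces: defined once, then referred to by name.
E_ANY = "[" + accented_variants("e") + "]"   # e with or without accents
NOT_E = "[^" + accented_variants("e") + "]"  # the complement of that class

cafe = re.compile("caf" + E_ANY)
print(bool(cafe.fullmatch("caf\u00e9")))  # precomposed e-acute
print(bool(cafe.fullmatch("cafe")))
```

The meaning of the complement stays the ordinary one; only the typing burden has moved into the helper. Note that this handles precomposed characters only; text that stores the accent as a separate combining mark would need normalizing first.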
I would hate to make the meaning of some regex counter-intuitive just because it's hard to type with today's software.
Mike Maxwell
CASL/ U Md