PROP: use of Perl Compatible Regular Expressions (long, sorry)

Hi all!

Attached is a patch against the current cvs tree which, following a
proposal by Brian Stafford, enables the use of Perl Compatible Regular
Expressions (PCRE).


Over the last few weeks (mainly in connection with detecting URLs) we
found that there are many slightly different implementations of the POSIX
regcomp/regexec functions around. E.g., the construction in spell-check.c

	const gchar *new_word_regex = "\\<[[:alpha:]']*\\>";

will *not* work for everybody as some libc versions do not recognise the
"\<" and "\>" patterns.
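A quick way to see whether a given libc honours these anchors is a small runtime probe. The following is only a sketch using the plain POSIX calls; the helper names regex_matches() and libc_has_word_anchors() are made up for illustration and are not part of balsa or of the patch.

```c
#include <sys/types.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if `pattern` compiles as a basic POSIX regex and matches
 * somewhere in `sample`, 0 otherwise (including on compile errors). */
static int
regex_matches(const char *pattern, const char *sample)
{
    regex_t re;
    int matched;

    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return 0;
    matched = (regexec(&re, sample, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

/* Hypothetical probe: "\<foo\>" should match the separate word "foo"
 * but not "foo" embedded in a longer word. A libc that treats "\<" and
 * "\>" differently fails one of the two checks. */
static int
libc_has_word_anchors(void)
{
    return regex_matches("\\<foo\\>", "a foo b")
        && !regex_matches("\\<foo\\>", "afoob");
}
```

If libc_has_word_anchors() returns 0 on a platform, the pattern from spell-check.c cannot be expected to work there.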

PCRE offers Perl-style regexes and should therefore be portable. It has a
POSIX-compatible API, which means that we could benefit from a "standard"
lib without having to rewrite the code. The home page of the PCRE lib (with
links to download pages) is

The list of patched files is rather long, but most changes are only for
detecting the availability of the library:



I ran several tests with this lib on Linux/Intel and Linux/PowerPC and
have not seen any problems so far. The performance is not very different
from the POSIX implementation. Some caution is necessary as PCRE can return
empty matches. E.g. rewriting the expression above as "\\b[[:alpha:]']*\\b"
may return an empty string (which is correct, since "*" also matches zero
characters). The simple solution is to use "\\b[[:alpha:]']+\\b" instead.
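The difference between "*" and "+" can be shown with a few lines of C. This is a rough sketch only: it uses the plain POSIX <regex.h> interface in extended mode and leaves out "\\b", which POSIX itself does not define. The helper name first_match_length() is invented for this example.

```c
#include <sys/types.h>
#include <regex.h>

/* Returns the length of the first match of `pattern` (POSIX extended
 * syntax) in `text`, or -1 if the pattern does not compile or there is
 * no match at all. A return value of 0 means an empty match. */
static int
first_match_length(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    int len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, m, 0) == 0)
        len = (int) (m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return len;
}
```

For the text "42 words", the pattern "[[:alpha:]']*" matches the empty string right at the start (length 0), while "[[:alpha:]']+" skips ahead to "words" (length 5). Code that loops over matches has to either use "+" or explicitly skip empty matches.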

Note that PCRE does not resolve the problems with detecting national
characters ("Umlauts") in the [:alpha:] class, as it relies on libc.

It would be *really* great if we could include PCRE support in balsa, as
this would simplify the detection of different URLs (yes, I'm *still*
working on that... ;-)).

Any opinions?

Cheers, Albrecht.

  Dr. Albrecht Dreß  -  Monschauer Straße 22  -  D-53121 Bonn (Germany)
      Phone (+49) 228 6199571  -  E-Mail

