Re: Performance implications of GRegex structure



On Thu, 2007-03-15 at 10:38 -0400, Morten Welinder wrote:
> [Re PCRE]
> 
> > (There is no match[er] object here, but the equivalent is in all the in
> > and out parameters ...)
> 
> Is it?  If PCRE is as glibc, there is lots of state in the compiled expression
> and you cannot use it threaded.  However, once the match call is done,
> another thread can use the compiled regexp.

PCRE has no relation to glibc, and the man page says:

     The  compiled form of a regular expression is not altered during matching, 
     so the same compiled pattern can safely be used by several threads at once.
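
A rough sketch of that usage pattern follows. It is written in Python rather than against the PCRE C API, on the assumption that the compiled pattern objects of Python's `re` module can likewise be shared read-only between threads. It is an analogy, not PCRE or GRegex code, and the pattern, function names and inputs are made up for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Compiled once; the worker threads below only read it.
WORD = re.compile(r"\b\w+\b")  # hypothetical pattern


def count_words(text):
    # Per-call match state lives in the values returned by the call,
    # not in the shared compiled pattern.
    return len(WORD.findall(text))


def count_all(texts, workers=4):
    # Several threads use the same compiled pattern at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_words, texts))
```

The point is only that per-match state is kept out of the compiled object, which corresponds to PCRE's in and out parameters mentioned above.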

> > Neither is very appealing to me as a coder, though I could be convinced
> > that the second [==re-compile] is OK by suitable performance timings. Do we
> > have such numbers?
> 
> It's hard to see what kind of numbers would make sense to use as an argument
> here.  The numbers will depend heavily (orders of magnitude) on the regexps
> and the data.

Well, I could imagine (maybe, barely) that someone could produce numbers
showing that, even for a variety of long and complicated regular
expressions, compiling them was still 10x as fast as matching them
against very short strings.

But in general, yes, part of my concern is that there are situations
where you are going to be matching the same regular expression against
thousands of strings, and in that situation, unless compilation is very,
very fast, the need to repeatedly recompile will inevitably produce
measurable overhead.
					- Owen
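
One way to get numbers for the "same expression against thousands of strings" case is a small timing harness like the one below. This is only a sketch in Python's `re` module, used as a stand-in for PCRE/GRegex; the pattern, the subject strings and the function names are invented for illustration, and the ratio it reports will not carry over to other engines, patterns or data. `re.purge()` is called so that the "recompile" variant does not silently hit the module's internal pattern cache.

```python
import re
import timeit

PATTERN = r"^(\w+)@(\w+)\.(com|org|net)$"  # hypothetical pattern
SUBJECTS = ["user%d@example.org" % i for i in range(1000)]  # made-up data


def match_compile_once(subjects):
    # Compile a single time, then reuse the compiled pattern.
    rx = re.compile(PATTERN)
    return sum(1 for s in subjects if rx.match(s))


def match_recompile(subjects):
    # Recompile before every match, as an API without a separate
    # compiled-pattern object would effectively force.
    hits = 0
    for s in subjects:
        re.purge()  # clear re's cache so the compile really happens
        if re.compile(PATTERN).match(s):
            hits += 1
    return hits


if __name__ == "__main__":
    for fn in (match_compile_once, match_recompile):
        seconds = timeit.timeit(lambda: fn(SUBJECTS), number=3)
        print(fn.__name__, round(seconds, 4))
```

Both functions return the same match count, so any difference in the reported times comes from the repeated compilation alone.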



