Re: Performance implications of GRegex structure
- From: "Gustavo J. A. M. Carneiro" <gjc inescporto pt>
- To: Yevgen Muntyan <muntyan tamu edu>
- Cc: Owen Taylor <otaylor redhat com>, GTK+ development mailing list <gtk-devel-list gnome org>
- Subject: Re: Performance implications of GRegex structure
- Date: Sun, 18 Mar 2007 14:16:35 +0000
I can't resist stating my opinion on this :P
I think it's OK to have a single GRegex object, with no separate match
or matcher, IF g_regex_copy is basically a lightweight copy.
I think this matches well with the rest of the GLib APIs wrt. thread
safety. None of the other GLib data structures are thread-safe. E.g.
you can't share a GList between threads, you have to protect it by a
mutex or have one copy for each thread. So why should GRegex follow a
And there is no doubt that, with C's manual memory management,
managing a single object is easier than managing multiple objects. And
for language bindings, wrapping one object is easier than wrapping
Just my .02€,
 Without looking at the code, I think g_regex_copy can easily become
lightweight by internally splitting the GRegex into a shared
read-only/immutable part and a non-shared state part, wherein
g_regex_copy would incref the shared part and create a new non-shared
part. Any attempt to modify a property that belongs to the immutable
part would trigger a new copy of that part (copy-on-write).
 Well, maybe except GAsyncQueue, but it was designed specifically for
threading; otherwise it is more or less a GList.
On Sat, 2007-03-17 at 18:37 -0500, Yevgen Muntyan wrote:
> Owen Taylor wrote:
> > On Sat, 2007-03-17 at 16:08 -0500, Yevgen Muntyan wrote:
> >> Yevgen Muntyan wrote:
> >>> [snip]
> >>> To me here the only good argument in favor of separate Match objects is
> >>> multi-thread uses.
> >>> Simply because we already have Match object, just hidden. If the best
> >>> way to fix GRegex
> >>> for multi-threading is a separate match object, then it should be a
> >>> separate match object.
> >> In fact it's not a solution, right? Since if it's separate Match
> >> structure, then Regex still needs to keep state.
> >> So, the solution is to rename some stuff to make GRegex be
> >> a GRegexExp or something, and move the actual functionality
> >> to some new GMatcher, i.e. not change anything conceptually but
> >> explicitly separate Pattern and Matcher. Did I get it right?
> > Yes, I think you've understood what I was talking about with a
> > matcher object ... almost all the methods in GRegex currently other
> > than g_regex_new()/g_regex_optimize() are conceptually matcher methods.
> > I don't have any objection to a matcher object with state; what I don't
> > like is binding it together with the pattern into a single indivisible
> > object.
> What I was arguing to (if you ignore "don't change it period" part)
> was creating new match objects every time you perform a match
> (it's what's done in Python). Basically I don't want every match()
> method to get me new allocated structure which has to be freed.
> And given it wouldn't work anyway, I was arguing to something
> which wouldn't work anyway :)
> Making cool new API which would be nice is certainly not a bad
> One thing should be taken care of: how all those things will
> be copied/referenced. The language bindings concern led
> to this silly g_regex_copy(); so we can get same funny
> thing when someone says "not bindings-friendly" about
> new API.
> Perhaps making Matcher and Regex ref-countable (perhaps
> internally for Regex) wouldn't be bad, would it?
> It would be great if concerned people  commented about it in
>  Owen :)
Gustavo J. A. M. Carneiro
<gjc inescporto pt> <gustavo users sourceforge net>
The universe is always one step beyond logic