Re: Performance implications of GRegex structure



  I can't resist stating my opinion on this :P

  I think it's OK to have a single GRegex object, with no separate match
or matcher, IF g_regex_copy is basically a lightweight copy[1].

  I think this matches well with the rest of the GLib APIs wrt. thread
safety.  None[2] of the other GLib data structures are thread-safe. E.g.
you can't share a GList between threads, you have to protect it by a
mutex or have one copy for each thread.  So why should GRegex follow a
different pattern?

  And there is no doubt that, with C's manual memory management,
managing a single object is easier than managing multiple objects.  And
for language bindings, wrapping one object is easier than wrapping
two :P

  Just my .02€,

[1]  Without looking at the code, I think g_regex_copy can easily become
lightweight by internally splitting the GRegex into a shared
read-only/immutable part and a non-shared state part, wherein
g_regex_copy would incref the shared part and create a new non-shared
part.  Any attempt to modify a property that belongs to the immutable
part would trigger a new copy of that part (copy-on-write).
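
To make the idea in [1] concrete, here is a minimal sketch in plain C.
Every name in it (SharedPart, Rx, rx_copy, rx_set_flags) is hypothetical
and made up for illustration; it is not the actual GRegex layout or API,
and the "compiled pattern" is just a string standing in for the real thing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT GLib's GRegex layout. */

/* Shared, read-only part: the pattern data plus a reference count. */
typedef struct {
    int   ref_count;
    char *pattern;   /* stands in for the compiled pattern */
    int   flags;     /* stands in for the compile options */
} SharedPart;

/* Per-object part: pointer to the shared part plus private match state. */
typedef struct {
    SharedPart *shared;
    int         last_match_start;   /* non-shared state */
    int         last_match_end;
} Rx;

static SharedPart *shared_new(const char *pattern, int flags)
{
    SharedPart *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->pattern = malloc(strlen(pattern) + 1);
    if (s->pattern == NULL) {
        free(s);
        return NULL;
    }
    strcpy(s->pattern, pattern);
    s->ref_count = 1;
    s->flags = flags;
    return s;
}

static void shared_unref(SharedPart *s)
{
    assert(s->ref_count > 0);
    if (--s->ref_count == 0) {
        free(s->pattern);
        free(s);
    }
}

Rx *rx_new(const char *pattern, int flags)
{
    Rx *rx = malloc(sizeof *rx);
    if (rx == NULL)
        return NULL;
    rx->shared = shared_new(pattern, flags);
    if (rx->shared == NULL) {
        free(rx);
        return NULL;
    }
    rx->last_match_start = rx->last_match_end = -1;
    return rx;
}

/* Lightweight copy: incref the shared part, create fresh match state. */
Rx *rx_copy(const Rx *src)
{
    Rx *rx = malloc(sizeof *rx);
    if (rx == NULL)
        return NULL;
    rx->shared = src->shared;
    rx->shared->ref_count++;
    rx->last_match_start = rx->last_match_end = -1;
    return rx;
}

/* Copy-on-write: modifying a shared property detaches this object first. */
int rx_set_flags(Rx *rx, int flags)
{
    if (rx->shared->ref_count > 1) {
        SharedPart *own = shared_new(rx->shared->pattern, flags);
        if (own == NULL)
            return -1;
        shared_unref(rx->shared);
        rx->shared = own;
    } else {
        rx->shared->flags = flags;
    }
    return 0;
}

void rx_free(Rx *rx)
{
    shared_unref(rx->shared);
    free(rx);
}
```

Note that the reference count here is a plain int, so in this sketch the
copy has to be made before the object is handed to another thread (or the
count would have to be updated atomically).  After the copy, each thread
only touches its own match state.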

[2] Well, maybe except GAsyncQueue, but it was designed specifically for
threading; otherwise it is more or less a GList.

On Sat, 2007-03-17 at 18:37 -0500, Yevgen Muntyan wrote:
> Owen Taylor wrote:
> > On Sat, 2007-03-17 at 16:08 -0500, Yevgen Muntyan wrote:
> >   
> >> Yevgen Muntyan wrote:
> >>     
> >>> [snip]
> >>> To me here the only good argument in favor of separate Match objects is
> >>> multi-thread uses. Simply because we already have Match object, just
> >>> hidden. If the best way to fix GRegex for multi-threading is a separate
> >>> match object, then it should be a separate match object.
> >>>   
> >>>       
> >> In fact it's not a solution, right? Since if it's separate Match
> >> structure, then Regex still needs to keep state.
> >> So, the solution is to rename some stuff to make GRegex be
> >> a GRegexExp or something, and move the actual functionality
> >> to some new GMatcher, i.e. not change anything conceptually but
> >> explicitly separate Pattern and Matcher. Did I get it right?
> >>     
> >
> > Yes, I think you've understood what I was talking about with a
> > matcher object ... almost all  the methods in GRegex currently other
> > than g_regex_new()/g_regex_optimize() are conceptually matcher methods.
> >
> > I don't have any objection to a matcher object with state; what I don't
> > like is binding it together with the pattern into a single indivisible
> > object.
> >   
> What I was arguing against (if you ignore the "don't change it, period"
> part) was creating new match objects every time you perform a match
> (it's what's done in Python). Basically I don't want every match()
> method to hand me a newly allocated structure which has to be freed.
> And given it wouldn't work anyway, I was arguing against something
> which wouldn't work anyway :)
> Making a cool new API which would be nice is certainly not a bad
> thing.
> 
> One thing should be taken care of: how all those things will
> be copied/referenced. The language bindings concern led
> to this silly g_regex_copy(); so we can get same funny
> thing when someone says "not bindings-friendly" about
> new API.
> Perhaps making Matcher and Regex ref-countable (perhaps
> internally for Regex) wouldn't be bad, would it?
> 
> It would be great if concerned people [1] commented  about it in
> http://bugzilla.gnome.org/show_bug.cgi?id=419368
> 
> Yevgen
> 
> [1] Owen :)
> 
> _______________________________________________
> gtk-devel-list mailing list
> gtk-devel-list gnome org
> http://mail.gnome.org/mailman/listinfo/gtk-devel-list
-- 
Gustavo J. A. M. Carneiro
<gjc inescporto pt> <gustavo users sourceforge net>
The universe is always one step beyond logic



