Re: Performance implications of GRegex structure



[Mark, I apologize, I accidentally sent it to you in private]

Mark Mielke wrote:
On Fri, Mar 16, 2007 at 09:15:37PM -0500, Yevgen Muntyan wrote:
I do understand that a separate match object is a good idea.
But "separate match object in C API is a good idea" is questionable.
While thread-safety is important, it doesn't seem likely that a single
GRegex object will be used from different threads to match something
in *many* cases. Maybe it makes sense to add thread safety
in some other way? The single-object version is certainly more
convenient than a version with a separate match object.
By the way, I don't know about Java, but having re.match()
return an object very often gets in your way in Python (for
different reasons, but it does say something about "it's done
so in Python").

It looks to me like you are suggesting the worst of all worlds.
Not thread-safe, not scalable, and not simple.

I am suggesting something which is currently used in real code.
Simple, nice, and working. *If* it's not as good as it should be,
*then* it should be changed.

If you want simple - give up on GRegex altogether. Try something like:

    if (g_string_regexp_match(s, pattern))
        ...

If it happens to do some sort of internal caching - great. If not?
At least it is simple. For Java, this maps to:

    if (string.matches(pattern))
        ...
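
As a rough sketch of what such a one-call helper could look like in C,
here is a version built on POSIX regex.h rather than GRegex. The name
string_regex_match is hypothetical (it stands in for the
g_string_regexp_match suggested above), and this version recompiles the
pattern on every call, with no internal caching.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot helper: compile `pattern`, test `s` once,
 * then free the compiled regex. Returns false both when there is
 * no match and when the pattern fails to compile. */
static bool string_regex_match(const char *s, const char *pattern)
{
    regex_t re;

    assert(s != NULL && pattern != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    bool matched = (regexec(&re, s, 0, NULL, 0) == 0);

    regfree(&re);
    return matched;
}
```

The caller never sees a regex or match object, which is the whole appeal;
the cost is that nothing about the match (position, groups) is available
afterwards.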

Should I say "if you want cool - give up on C altogether"? I guess
not. I mean, it's not like I want simplicity at all costs. No need to
provide fancy-shmancy Java stuff. It's a serious question: in which
uses of GRegex would it be convenient to have a separate Match
structure, and in which would it not?

Having a matcher object serves more purposes than just thread
scalability. What if I wish to walk through the string, finding
each match, processing each match as it is found?

Possible.

Why should I have to search the entire string before I can display
the first match?

You can't do the contrary - find all matches and display them.
(I guess Marco should know better, I've never done stuff like
this.)

In Perl, this functionality is available as:

    while ($scalar =~ /(pattern)/g) {
        ... each match ...
    }

With a Matcher object, the same can be accomplished in a thread-safe
manner.

Could you show how it would be done (i.e. show C code)? And
what's "Matcher"? Is it something that performs matching, or
is it a result of search (match)? I guess it could make sense to
collect all results after searching repeatedly, but it doesn't seem
to be what you are talking about.
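
For illustration only, here is one way the Perl loop above might look in
C when the match state lives in its own object. It is sketched with
POSIX regex.h, not GRegex; the Matcher struct and the matcher_init and
matcher_next names are hypothetical and are not part of any proposed
GRegex API. In this sketch "Matcher" is the thing that performs the
search and also holds the offsets of the most recent match. The pattern
must be compiled without REG_NOSUB so that offsets are reported.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-caller match state. The compiled regex_t is only
 * read through this struct, never modified by the code below. */
typedef struct {
    const regex_t *re;      /* compiled pattern, owned by the caller */
    const char    *subject; /* string being searched */
    size_t         pos;     /* offset where the next search starts */
    size_t         start;   /* start offset of the last match found */
    size_t         end;     /* end offset (exclusive) of that match */
} Matcher;

static void matcher_init(Matcher *m, const regex_t *re, const char *subject)
{
    assert(m != NULL && re != NULL && subject != NULL);
    m->re = re;
    m->subject = subject;
    m->pos = 0;
    m->start = 0;
    m->end = 0;
}

/* Find the next match. Returns false when there are no more. */
static bool matcher_next(Matcher *m)
{
    regmatch_t pm[1];

    if (m->subject[m->pos] == '\0')
        return false;

    /* After the first search, tell regexec that the start of the
     * buffer is not the start of a line, so ^ does not match there. */
    int flags = (m->pos > 0) ? REG_NOTBOL : 0;

    if (regexec(m->re, m->subject + m->pos, 1, pm, flags) != 0)
        return false;

    m->start = m->pos + (size_t)pm[0].rm_so;
    m->end   = m->pos + (size_t)pm[0].rm_eo;

    /* Step past an empty match so the loop always advances, but
     * never move beyond the terminating NUL. */
    if (pm[0].rm_eo == pm[0].rm_so && m->subject[m->end] != '\0')
        m->pos = m->end + 1;
    else
        m->pos = m->end;

    return true;
}
```

A caller would write while (matcher_next(&m)) { ... } and handle each
match through m.start and m.end as soon as it is found, without first
collecting every match. Each thread could keep its own Matcher over the
same compiled pattern; whether sharing the compiled pattern across
threads is actually safe depends on the regex implementation underneath.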

Yevgen




