Re: Changes to the GLib main loop

From: Sebastian Wilhelmi <wilhelmi ira uka de>
To: Owen Taylor <otaylor redhat com>
Cc: gtk-devel-list gnome org
Subject: Re: Changes to the GLib main loop
Date: Tue, 19 Jun 2001 16:53:19 +0200
Hi Owen,

> I'm glad you had a chance to try out the new main loop code in
> detail. I basically wrote the multiple main loop support without
> a real live example in mind, so some testing in practice is definitely
> a good thing.
> 
> In concept, I'm not opposed to the idea of allowing GMainContext's
> not to be restricted to running in a single thread; however,
> there are some definite implementation challenges there that I'm
> not sure that your code addresses.
> 
> > For multithreaded programs there is the added advantage, that you can
> > use the usual idioms in other threads as well (though they will do
> > nothing then, just wait, whenever may_block is true.):
> >
> > while (g_main_context_pending(context))
> >   g_main_context_iteration(context, may_block)
> 
> But the way you've written it, this doesn't do at all what people
> expect.
> 
> What people will try to use this for is:
> 
>  - Main thread is blocking in a long computation
>  - To continue processing events, iterate the main loop from
>    a different thread.

I don't think the programmer expects that. I think

while (g_main_context_pending(context))
  g_main_context_iteration(context, may_block)

is a idiom for saying "if there are any pending updates for e.g. GTk windows,
do them, before I continue a long computation". And indeed these updates are
done, but in another thread, so FALSE is returned. and the loop is not
entered. Of course the other thread might block itself in a long computation,
but again it might call the above two lines from time to time and it still
works. 

Whenever some thread "owns" the context, I think there is no sensible way to
poll and dispatch on that context from another thread, e.g. trying to continue
processing events while another thread blocks while dispatching that events.

> But that doesn't work since g_main_context_iteration() will just wait
> for the main thread. So allowing someone to write this is just
> misleading them.

g_main_context_iteration() is blocking, but g_main_context_pending is
returning FALSE. It's just as it would happen in a single threaded program,
only that g_main_context_iteration wouldn't do anything, as everything is
already done in another thread, not very surprising, I think.

> If we actually want to allow people to call g_main_context_iteration()
> from multiple threads, then we have to implement it so that if one
> thread is stuck in dispatch() then another thread will recursively
> iterate; if the first thread is however, blocking in poll(), or is
> doing anything other than dispatching, then it has to simply
> wait.
> 
> This is conceptually possible, but quite a pain to implement, which is
> one reason I didn't do it. (Remember, the context lock is _not_ held
> over prepare/query/check.) There are also questions of what happens to
> outstanding dispatches when multiple threads are iterating on the main
> loop. A recursive invocation from a single loop simply discards
> outstanding dispatches and starts over, but this clearly isn't the
> right behavior for multiple threads.
> 
> Possibly the right behavior is for the threads to dispatch requests
> round robin, and for the first free thread to start a new cycle,
> ignoring sources currently being dispatched. This would need
> _very_ careful definition, and the combination of this and
> recursion within threads presents a highly complex situation.

That is true, it sound like a implementation nightmare with no real benefit.
If the user want's to dispatch certain events to other threads, s/he should
use GAsyncQueue or better yet GThreadPool, which is just for doing that.

> The context->owner mechanism presents some problems with GMainLoop.
> Consider what happens when:
> 
>  - Thread A calls g_main_context_iteration()
>  - Before that returns, thread B calls g_main_run() on the same thread
>  - Thread A returns and starts a long computation

No problem, Thread A calls g_main_context_iteration and set owner to A. thread
B finds A owning the contexts waits, A is ready and signals B to be ready, B
finds loop->context->owner == NULL and then simply takes the control of the
main loop. Neat, eh?

> Becaus Thread A was the owner when the main loop started, thread B
> will continue waiting forever for some other thread to run the
> main loop. Since creation of a context doesn't identify the owner,
> we introduce the potential of race conditions for ownership of
> a thread.

As outlined, I don't think so.
 
> Also, using g_main_loop_run() to wait for, say, a modal dialog,
> in a second thread has become _much_ less efficient since you
> are waking up the second thread via g_cond_broadcast() on
> _every_ iteration. In face, I'd consider this to be an
> unacceptable performance hit. The reason why each GMainLoop
> currently has a separate condition variable is to avoid this.

Ok, but that could be optimized away by also telling g_main_context_iterate,
whether it was called from g_main_loop_run or not only signalling other
threads, when it was not run from g_main_loop_run. For calls not done from
g_main_loop_run it would still need to be called, because of the scenario
outlined above (the thread A and B thingy)

> A final general consideration with your changes is that you
> have effectively made the public functions:
> 
>   gboolean g_main_context_prepare  (GMainContext *context,
>                                     gint         *priority);
>   gint     g_main_context_query    (GMainContext *context,
>                                     gint          max_priority,
>                                     gint         *timeout,
>                                     GPollFD      *fds,
>                                     gint          n_fds);
>   gint     g_main_context_check    (GMainContext *context,
>                                     gint          max_priority,
>                                     GPollFD      *fds,
>                                     gint          n_fds);
>   void     g_main_context_dispatch (GMainContext *context);
> 
> Not usuable, because g_main_context_iterate() manipulates
> the context structure directly in ways that external code
> can't. If you look at the current implementation of
> g_main_context_iterate(), you'll see that the only direct
> use of the GMainContext structure is to cache the GPollFD
> array.

No, that's a bogus argument, I didn't change those functions. They are as
usable as before and still you can easily program g_main_context_iterate as in
gmain.c. You only would need to hold the owner, cond and need_broadcast
members in your own structure, as before for cached_poll_array and
cached_poll_array_size. Of course mixing own mainloops with the GLib one is a
delicate thing, but that has been the case with the old solution as well.
 
> One of things I wanted to achieve going into my main loop changes
> for GLib-2.0 was that people would be able to integrate the GLib
> main loop with other main loops that worked in different
> ways. This is the reason for making these functions public,
> and I think you're changes break this ability.

I don't think so, please elaborate.

> > - g_main_context_iterate handles cached_poll_array more inteligent.
> >   The old solution looked quite doubios to me.
> 
> I'm not sure your handling is "more intelligent". It is certainly
> simpler. My initial reasoning for the complexity was that the
> fd array needed to remain the same over the entire
> prepare/query/check() sequence; the code was meant to ensure that.

What I mean with doubios is, that whenever you loop through the do {} while
loop for the second time, you have a memory leak, so the while (new_nfds >
nfds) should rather be a g_assert (new_nfds > nfds), but that could fail, as
you do not have the lock, while calling g_main_context_query, and add_poll
could intercept the whole thing.

> From a brief look, it appears that since g_main_iterate() can
> never recurse until the dispatch() phase, your change is probably
> OK, assumming you fix up g_main_context_add_poll().
> 
> > - GStaticMutex is used instead of GMutex. This makes it possible to
> >   initialize a GMainContext before calling g_thread_init ();
> 
> Didn't we say that g_thread_init() _must_ be called before any
> other GLib functions. Using a GStaticMutext for this reason
> strikes me as a bit dubious...

Until now we didn't require this, on the contrary we made sure to let the main
thread inherit all thread private data set before g_thread_init () and so on.
In general some libraries might to initialize the thread system, which the
main program doen't know about or something like that, so I think, that
requiring this isn't good. Also you save a malloc, and an indirection, as the
mutex is in situ.

> > - Adapted and Extended tests/mainloop-test to new setup. Works ok
> >   here. The failures in the old versions seems to be caused by
> >   incorrect locking. I now use one big lock instead of the two before
> >   and now it works like a charm.
> 
> Hmm, I'd rather know _what_ the incorrect locking was. Maybe I'll
> take another look at it later.

I didn't investigate this further then. I would also very much like to know,
why it failed, but until now I didn't find it. I agree, that it's not very
nice to just say "Now it works".

> > All in all I like this patch very much.... ;-) and I hope you do as
> > well. Really this patch (at least the API, I proposed) really would
> > make life for ORBit/ORBit-mt much easier.
> 
> Hmm, well, as you can see above, I don't _completely_ like
> it. :-)

That's what I feared ;-)

> Your proposed changes to the interface certainly clean it up some,
> but I'm worried that the result behaves in unexpected ways.
> 
> We basically have two courses we could take:
> 
>  * Try and figure out how to implement your interface so that
>    multiple threads can actually iterate the same thread.
> 
>    This probably is the ideally preferrable course, but it has
>    always struck me as very hard, and the resulting behavior
>    might still be confusing.

I think, with the above explanations, that my implementation does that (I
would add the optimization to not wake up the waiting threads from
g_main_context_iterate, when called from g_main_loop_run)

> > Another thing, that really makes me wonder are the lines
> >
> >   'if (pollrec->fd->events)'
> >
> > in g_main_context_qurey and g_main_context_check. Can't this be
> > checked in g_main_context_add_poll, or do we allow fd's with empty
> > event fields to be added. Actually this would indicate a programming
> > mistake, doesn't it?
> 
> I think it is mainly a case of allowing all cases, including the
> NULL one for the sake of completeness. I'm not sure a 0 field
> here is actually a programming mistake... it could occur
> theoretically through simple laziness.

I would still sort it out in g_main_context_add_poll, not nessecarily with
g_return_if_fail, but why not, actually?

Bye,
Sebastian

-- 
Sebastian Wilhelmi
mailto:wilhelmi ira uka de
http://goethe.ira.uka.de/~wilhelmi
Follow-Ups:
- Re: Changes to the GLib main loop [ g_main_iterate() ]
  - From: Owen Taylor
- Re: Changes to the GLib main loop [ GMainLoop ]
  - From: Owen Taylor
- Re: Changes to the GLib main loop [ remainder ]
  - From: Owen Taylor
References:
- Changes to the GLib main loop
  - From: Sebastian Wilhelmi
- Re: Changes to the GLib main loop
  - From: Owen Taylor
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]