Re: next round: glib thread safety proposal




Sebastian Wilhelmi <wilhelmi@ira.uka.de> writes:

> > > - it was a bad idea to use the duration as a relative time. we have to
> > > use an absolute ending time for the wait. this however doesn't fit too
> > > well into the current setup. I added a type GAbsTime, which basically is
> > > struct tm, but as this type is not to be found on win32, I added a new
> > > type. I actually do also think that to use this would be better for the
> > > new glib mainloop too. we could then have some convenience functions to
> > > manipulate them, i.e. get the current time, add n microseconds, subtract
> > > another AbsTime and so on.
> > 
> > Could you expand a bit more on why relative times are bad for
> > timed waits?
> 
> If you are doing a timed wait on a GCond, and you are signaled that
> the condition has come true, the GCond wait is not left before you
> reentered the mutex. but you're not guaranteed to be the next thread to
> enter the mutex, so by the time you enter, the condition might already be
> false (condition here doesn't mean GCond, but the condition itself, i.e.
> i==0 ) and thus you have to wait again for the rest of the time. If you
> would be using relative times, you couldn't do that properly, except when
> you would provide a pointer to a long value and would leave the rest time
> behind there. but that would not be accurate, if the above would happen
> frequently. So technically it is best to use a GAbsTime, we could even
> make them be used as arguments and return values (instead of pointers to
> them) to make them easier to use (after all, they are as big as gint64),
> and by now all c compilers should support this, or am i wrong?
>  
> > In my new main loop, I already have GTimeVal (== GAbsTime) and
> > g_get_current_time(), used for absolute times.
> >
> > Millisecond intervals are still used for timeout specifications,
> > and for the timeout on the central poll. (For compatibility, and for
> > convenience, and because the main loop cannot make guarantees about
> > time of delivery of its timeouts in any case, and because poll() only
> > has millisecond specification for the timeout.)
>
> I do not know, if the above condition wait scenario would apply to this
> too, so it might or might not apply to glib main loop. You'll know best.

It definitely doesn't apply to timeouts added with g_timeout_add(),
since they can never be called prematurely. The scheduling of
GSources is somewhat analogous to the timed condition waits,
but since the absolute current time is passed to the source before
each poll(), a source will have no trouble figuring out how
much longer it needs to wait.

> > > - I added GStaticMutex, that can be used after being initialized
> > > statically. Much more convenient, than having to do dynamic
> > > initialization on your own. This however builds on GMutex, and as such
> > > is slower (but just one non-zero-test after first time use)
> > 
> > Hmmm, I think the following scenario can occur on SMP machines:
> > 
> >  1) Thread A finds an uninitialized mutex, and calls
> >     g_static_mutex_init()
> > 
> >  2) Thread A initializes the mutex (in cache)
> > 
> >  3) Thread A writes the pointer to the mutex into main memory
> > 
> >  4) Thread B finds the pointer, follows it to the (as yet
> >     uninitialized) contents and tries to use them.
> > 
> >  5) Thread A writes the contents of the mutex into main memory.
> > 
> > As I understand it, to guarantee ordering of writes from one
> > processor, as seen from another processor, you need mutexes
> > on both sides. :-(
> 
> no, the cache coherence protocol to be found on all such systems is
> supposed to prevent this. I asked someone from our institute with detailed
> knowledge in this topic and he said that we could rely on this not happening.

There was a long thread on comp.programming.threads about
this some months ago:

 http://www.dejanews.com/dnquery.xp?search=thread&recnum=%3c1998\
 May28.082712@bose.com%3e%231/1&svcclass=dnserver

Now, this is wisdom from Usenet (pearls from swine, so to
speak), and I don't put a whole lot of faith in it. But the
general conclusion of that thread seems to be that on some
MP architectures, the approach may not be safe.

Re-reading the thread, I see that the problem is a little
different from what I presented above. Essentially, it is:

 1) Thread B reads in some other data, which happens to
    be in the same cache line as the contents of the mutex.

 2) Thread A writes the contents of the mutex and the
    pointer to the mutex.

 3) Thread B reads in the pointer to the mutex, and uses
    it to access the stale cache data for the contents
    of the mutex.

This is possible for memory models that allow out-of-order
reads.
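
For illustration only, here is one way the lazy initialization could be
written so that the pointer and the contents it points to are ordered on
both the writing and the reading side. It is a sketch, not the proposed
GStaticMutex: it assumes C11 <stdatomic.h> (which postdates this
discussion) and POSIX threads, and StaticMutex, STATIC_MUTEX_INIT,
init_guard and static_mutex_get() are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical statically initializable mutex: just a pointer slot. */
typedef struct {
    _Atomic(pthread_mutex_t *) impl;
} StaticMutex;

#define STATIC_MUTEX_INIT { NULL }

/* One global lock serializes first-time initialization. */
static pthread_mutex_t init_guard = PTHREAD_MUTEX_INITIALIZER;

static pthread_mutex_t *static_mutex_get(StaticMutex *sm)
{
    /* Acquire load: if we see a non-NULL pointer, we are also
     * guaranteed to see the initialized contents behind it. */
    pthread_mutex_t *m = atomic_load_explicit(&sm->impl,
                                              memory_order_acquire);
    if (m == NULL) {
        pthread_mutex_lock(&init_guard);
        /* Re-check under the lock; another thread may have won. */
        m = atomic_load_explicit(&sm->impl, memory_order_relaxed);
        if (m == NULL) {
            m = malloc(sizeof *m);
            if (m == NULL)
                abort();
            pthread_mutex_init(m, NULL);
            /* Release store: publish the pointer only after the
             * contents have been written. */
            atomic_store_explicit(&sm->impl, m, memory_order_release);
        }
        pthread_mutex_unlock(&init_guard);
    }
    return m;
}
```

The fast path is still a single non-NULL test after first use; the
difference from a plain pointer read is that the acquire load forbids
the reader from using stale data for the mutex contents.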
 
> Could this go into glib before the 1.2 code freeze?
> 
> (My hurry is because I could really need this stuff for the multithreaded
> ORBit, that I'm currently working on, so at least you should keep calm ;-)

Yes, I'd like to have it in for 1.2, because it will
make it easier to maintain the current level of
thread safety for GTK+ while moving the main loop
to GLib, and because the Java people need the vfunc
mutex setting.

> To conclude: Three open issues still:
> 
> - what about the thread specific data stuff.
> 
> - what about the separation of mutex and cond functions into two classes,
>   as Aaron suggested. I do personally not think, that this would help, as
>   there is only one instance of that class anyway, and the user is actually
>   not supposed to touch this in normal operation. It's just for some special
>   cases (named mozilla ;-).

My opinion is that it isn't necessary. We can still extend both 
the mutex and the condvar interfaces, with a single vfunc block - 
they'll just end up a bit mixed together. (And we can't extend
the block without breaking compatibility in any case)
 
> - what about the GAbsTime, or whatever you name it for your purpose, can 
>   they be used 'by value'. It would make programming much more simple and 
>   only very slightly slower.

This is a style issue, and I don't have much of an opinion
on it. I think passing and returning structures (of any size)
by value is portable, though I don't know about the
performance implications. If I were designing the interfaces,
I'd probably tend to pass by reference, but mostly because
structures aren't passed by value anywhere in GLib or
GTK+ currently.
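
To make the two styles concrete, a small sketch in portable C. AbsTime
and the three helpers are hypothetical names, not existing GLib API; the
two-long layout merely mirrors the seconds/microseconds split of a
timeval-like structure.

```c
/* Hypothetical absolute-time type: seconds plus microseconds. */
typedef struct {
    long sec;
    long usec;   /* kept in [0, 999999] */
} AbsTime;

/* By value: returns a new AbsTime and leaves the argument alone.
 * usec is assumed to be non-negative. */
static AbsTime abs_time_add_usec(AbsTime t, long usec)
{
    t.sec  += usec / 1000000L;
    t.usec += usec % 1000000L;
    if (t.usec >= 1000000L) {   /* carry into seconds */
        t.sec  += 1;
        t.usec -= 1000000L;
    }
    return t;
}

/* By reference: modifies the caller's structure in place. */
static void abs_time_add_usec_ref(AbsTime *t, long usec)
{
    *t = abs_time_add_usec(*t, usec);
}

/* Difference a - b in microseconds. */
static long abs_time_diff_usec(AbsTime a, AbsTime b)
{
    return (a.sec - b.sec) * 1000000L + (a.usec - b.usec);
}
```

The by-value form lets calls nest, e.g.
abs_time_diff_usec(abs_time_add_usec(now, 500), now), while the
by-reference form matches what the rest of GLib and GTK+ does today.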

Regards,
                                        Owen


