Thread cancellation and joining (GThreads)



Hi there,


I am building an application that makes asynchronous calls to a CORBA
server. Both sides use ORBit: the client uses ORBit-2 and the server
uses ORBit-1. We already have that combination working.

The calls are asynchronous because a single thread walks a queue that
can grow while the application runs. Each queue item has a priority
field, which is used to pick the most urgent call in the queue as soon
as the previous job has finished.
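
To make the queue mechanism concrete, here is a minimal sketch of the
selection step. It is not my real code: the Job struct, its fields and
queue_pop_most_urgent() are names made up for this mail, and I assume
here that a higher number means "more urgent" and that the caller holds
whatever lock protects the list.

```c
#include <stddef.h>

/* Hypothetical queue item; the real items also carry the CORBA call data. */
typedef struct Job {
    int priority;        /* assumption: higher value = more urgent */
    const char *name;
    struct Job *next;
} Job;

/* Unlink and return the most urgent job, or NULL if the queue is empty.
 * Among jobs with equal priority, the one nearest the head wins. */
static Job *queue_pop_most_urgent(Job **head)
{
    Job **best = head;
    Job **cur;

    if (*head == NULL)
        return NULL;

    /* Walk the list, remembering the link that points at the best job. */
    for (cur = &(*head)->next; *cur != NULL; cur = &(*cur)->next) {
        if ((*cur)->priority > (*best)->priority)
            best = cur;
    }

    Job *job = *best;
    *best = job->next;   /* unlink it from the list */
    job->next = NULL;
    return job;
}
```

The queue thread calls something like this right after a job returns,
which is why one job that never returns stalls everything behind it.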

This queue mechanism works. I am already making multiple asynchronous
CORBA calls that return, for example, image data, which I use between a
gdk_threads_enter() and a gdk_threads_leave() to fill a GtkDrawingArea.
This is very stable and it just, well ... works.

However...

We simulated a crash on the server side (the CORBA server). When the
CORBA server crashes, the thread keeps waiting inside the CORBA function
call, presumably for the server to close the connection. The result is
that my queue stops processing (it actually waits) until the server is
brought down completely (killed) and/or restarted.

I don't want my queue thread to wait in that call. If the CORBA server
fails, I want to give my user the power to cancel the job, and perhaps
even to abort the whole queue thread and restart it at a later time (or
have it restart automatically and pick up where it left off).

However...

So far I have failed to find a g_thread_cancel ()-like API call. I could
use a flag that stops the thread before it starts its next call, and in
fact I am doing that already, so that the queue thread can shut down
properly and my main thread can join it during the cleanup phase of my
application (while the application is shutting down). However, the
thread is not returning from the CORBA function that I am calling. It
actually hangs in that function ... I guess this is what they call a
blocking operation: the function does not return until it has completed.
On failure, it only returns when the server sends some failure signal or
whatever. The server, which has crashed (simulated crash), will never
send that signal. Perhaps the ORBit library will abort after some
timeout, but I don't want my user to wait for such a timeout.
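
To show why the flag does not help here, this is roughly what my loop
boils down to. It is a simplified sketch with made-up names (run_jobs,
JobFunc, stop_requested and the two demo jobs are not from my real
code), using a C11 atomic for the flag. The flag is only read between
jobs, so a job that never returns means the flag is never looked at
again.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Set by another thread (or, in this demo, by a job) to request shutdown. */
static atomic_bool stop_requested = false;

/* Hypothetical stand-in for "one blocking CORBA call". */
typedef void (*JobFunc)(void *data);

/* Demo job: counts how often it ran. */
static void job_count(void *data)
{
    ++*(int *)data;
}

/* Demo job: asks the loop to stop after this job. */
static void job_request_stop(void *data)
{
    (void)data;
    atomic_store(&stop_requested, true);
}

/* Run jobs in order until all are done or a stop was requested.
 * Returns the number of jobs that were actually started and finished. */
static int run_jobs(JobFunc *jobs, void **data, int n)
{
    int done = 0;

    for (int i = 0; i < n; i++) {
        if (atomic_load(&stop_requested))
            break;            /* the only point where the flag is checked */
        jobs[i](data[i]);     /* if this blocks forever, we never get back here */
        done++;
    }
    return done;
}
```

With the jobs { job_count, job_request_stop, job_count }, the third job
is never started, which is exactly the clean shutdown I already have.
It does nothing for a job that is stuck inside the call itself.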

So basically, I want to forcefully abort the thread-execution.

Is that possible? Or is there a better method for my specific situation?

I have been thinking about launching a new thread per call. The calls in
the queue cannot be executed in parallel by the server. This is a fact
which cannot be changed: the server controls a mechanical device which
just can't, really can't, do things in parallel. So I would have to make
the queue-loop thread join the launched thread, making it wait for that
thread to terminate before it selects the next job to launch. That is
fine for me (I could program it like that). I just don't see how it
solves anything: I will face the exact same problem, but will have yet
another thread to manage :) (although perhaps one that I can forcefully
kill more easily, and how to do that is exactly what I don't know, hence
my question).
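
One variation I can imagine, instead of a plain join, is to wait for the
per-call thread with a deadline. The sketch below is untested against
ORBit and uses plain POSIX threads rather than GThread, only to keep it
self-contained; call_with_timeout(), Call, fast_call() and stuck_call()
are names I made up. Note that it does not kill anything: on timeout the
helper thread is merely abandoned and may stay blocked in the call.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Shared between the waiting thread and the helper thread. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t finished;
    bool done;
    int refs;                 /* one reference per side; last one frees */
    void (*func)(void *);
    void *arg;
} Call;

static void call_unref(Call *c)
{
    pthread_mutex_lock(&c->lock);
    int left = --c->refs;
    pthread_mutex_unlock(&c->lock);
    if (left == 0) {
        pthread_cond_destroy(&c->finished);
        pthread_mutex_destroy(&c->lock);
        free(c);
    }
}

static void *call_thread(void *p)
{
    Call *c = p;

    c->func(c->arg);          /* stand-in for the blocking CORBA call */
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->finished);
    pthread_mutex_unlock(&c->lock);
    call_unref(c);
    return NULL;
}

/* Returns true if func finished within timeout_ms, false otherwise. */
static bool call_with_timeout(void (*func)(void *), void *arg, int timeout_ms)
{
    Call *c = malloc(sizeof *c);
    pthread_t tid;
    struct timespec deadline;
    int rc = 0;

    if (c == NULL)
        return false;
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->finished, NULL);
    c->done = false;
    c->refs = 2;
    c->func = func;
    c->arg = arg;

    if (pthread_create(&tid, NULL, call_thread, c) != 0) {
        pthread_cond_destroy(&c->finished);
        pthread_mutex_destroy(&c->lock);
        free(c);
        return false;
    }
    pthread_detach(tid);      /* nobody joins it; it cleans up via call_unref */

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&c->lock);
    while (!c->done && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&c->finished, &c->lock, &deadline);
    bool ok = c->done;
    pthread_mutex_unlock(&c->lock);
    call_unref(c);
    return ok;
}

/* Demo stand-ins: one call that returns at once, one that "hangs". */
static void fast_call(void *arg) { *(int *)arg = 1; }
static void stuck_call(void *arg) { (void)arg; sleep(2); }
```

With this, call_with_timeout(stuck_call, NULL, 100) gives control back
to the queue loop after about 100 ms while the helper thread is still
stuck. Whatever arg points to must stay valid for as long as that thread
may still run, and since the server cannot work in parallel, I am not
sure it is safe to start the next job while the old call is pending.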
 
Sidenote: When I do abort the thread, I want to have the opportunity to
clean up the memory allocated in that thread. Otherwise aborting such a
thread would leak some memory, I guess. Joining the thread should also
still be possible, so when the thread does get cancelled, it should end
through a g_thread_exit () call (otherwise the join will fail, I guess:
when I kill that specific thread using a debugger, the join hangs).


-- 
Philip Van Hoof, Software Developer @ Cronos
home: me at freax dot org
gnome: pvanhoof at gnome dot org
work: Philip dot VanHoof at Cronos dot Be
http://www.freax.be, http://www.freax.eu.org


