Re: [Tracker] [Patch] New CVS version



On Friday 25 August 2006 at 00:31 +0100, Jamie McCracken wrote:
Jamie McCracken wrote:
Laurent Aguerreche wrote:
On Tuesday 22 August 2006 at 10:39 +0100, Jamie McCracken wrote:
Gergan Penkov wrote:
and trackerd does not exit, I haven't waited more than 5-10 minutes though.
Rats - I thought I sussed this one!

If it takes more than a few seconds to exit then it's a deadlock on a 
mutex. I guess I should unlock them all as a precaution when exiting. 
Please try the latest CVS again (I cannot replicate your problem, mind you).
I found a case where it doesn't exit smoothly.

What I found:

in a thread, called thread1, tracker_log() is called to write "Watching
directory %s (total watches = %d)":
- g_print() to STDOUT, Ok;
- then,
  1/ lock()      <<====== important
  2/ open() on ~/.Tracker/tracker.log
- *BOOM*, user does a Ctrl+C, so this thread won't continue as
signal_handler() has been called and replaces prior code...
- now, tracker_log() is called to write "Received termination signal %d
so now exiting."
- g_print() to STDOUT, Ok;
- then,
  1/ lock()
  of course, it will wait forever since this mutex is already locked
and won't be released...

At the same time, other threads will want to log something so they will
be locked.
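
The sequence above can be replayed on a single thread. This is only a minimal sketch, not the trackerd code: demo_log_mutex, probe_log_mutex() and replay_interrupted_log() are hypothetical names, and pthread_mutex_trylock() is used in place of a blocking lock so that the example reports the problem instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the mutex that guards the log file. */
static pthread_mutex_t demo_log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 if the mutex was free (it is taken and released again),
 * or EBUSY if it is already held. In the EBUSY case a plain
 * pthread_mutex_lock() at this point would never return. */
static int probe_log_mutex(void)
{
    int rc = pthread_mutex_trylock(&demo_log_mutex);
    if (rc == 0)
        pthread_mutex_unlock(&demo_log_mutex);
    return rc;
}

/* Replays the reported sequence and returns what the code running in
 * the signal handler would observe when it tries to log. */
static int replay_interrupted_log(void)
{
    int seen_by_handler;

    pthread_mutex_lock(&demo_log_mutex);   /* 1/ lock() in thread1        */
    seen_by_handler = probe_log_mutex();   /* Ctrl+C: handler tries to log */
    pthread_mutex_unlock(&demo_log_mutex); /* reached only once the handler
                                              has returned                 */
    return seen_by_handler;
}
```

replay_interrupted_log() returns EBUSY: the handler runs on top of the thread that holds the mutex, so that thread cannot release it until the handler returns.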



Great thanks for that - it sure was a subtle one!

The right thing to do here is not to tracker_log anything until the 
second phase of the shutdown. (the resumption of code after the sig 
handler will allow the code to continue to unlock the mutex).
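
A minimal sketch of that two-phase idea, assuming a handler that only records the signal and a later check made from ordinary code. All names here (on_termination_signal, shutdown_if_requested, log_signal, install_handler) are hypothetical and are not the functions in trackerd.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t pending_signal = 0;

/* Phase one: runs in signal context, so it takes no lock and does no
 * I/O. It only records which signal arrived. */
static void on_termination_signal(int signum)
{
    pending_signal = signum;
}

/* Hypothetical stand-in for a mutex-protected logging call. */
static void log_signal(int signum)
{
    pthread_mutex_lock(&log_lock);
    fprintf(stderr, "Received termination signal %d so now exiting.\n",
            signum);
    pthread_mutex_unlock(&log_lock);
}

/* Phase two: called from ordinary code after the handler has returned,
 * when the interrupted thread has been able to release its locks.
 * Returns the recorded signal number, or 0 if none is pending. */
static int shutdown_if_requested(void)
{
    int signum = pending_signal;
    if (signum != 0)
        log_signal(signum);
    return signum;
}

/* Installs the phase-one handler; returns 0 on success, -1 on error. */
static int install_handler(int signum)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = on_termination_signal;
    return sigaction(signum, &act, NULL);
}
```

A main loop would call shutdown_if_requested() between units of work and start its cleanup when the return value is non-zero.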


Have now altered CVS to fix this.

Also had a stab at improving the exiting (mostly guesswork) and 
increasing thread sync speed.

According to what I understood in trackerd.c, the idea is to use
signal_handler() as a catapult to run an independent, delayed
do_cleanup() and to let signal_handler() finish ASAP so that the
interrupted thread can continue.

So in my opinion, "act.sa_flags = SA_NODEFER;" should be reverted to
"act.sa_flags = 0" in main(). This will block all the threads as long as
signal_handler() is running and will avoid any surprise!
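
The effect of the two sa_flags values can be checked with a small single-threaded probe. This is a sketch with hypothetical names (probe_handler, signal_blocked_during_handler). As far as POSIX specifies, without SA_NODEFER the signal being handled is added to the signal mask of the thread that runs the handler; the sketch checks only that and says nothing about what other threads do in the meantime.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t blocked_in_handler = -1;

/* Records whether the signal being handled is blocked while the handler
 * runs. sigprocmask() is enough for this single-threaded sketch. */
static void probe_handler(int signum)
{
    sigset_t current;

    sigemptyset(&current);
    /* With a NULL set, sigprocmask() only reports the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &current) == 0)
        blocked_in_handler = sigismember(&current, signum);
}

/* Installs probe_handler with the given sa_flags, delivers the signal
 * once, restores the previous disposition, and returns 1 if the signal
 * was blocked inside the handler, 0 if it was not, -1 on error. */
static int signal_blocked_during_handler(int signum, int flags)
{
    struct sigaction act, old;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_flags = flags;
    act.sa_handler = probe_handler;

    blocked_in_handler = -1;
    if (sigaction(signum, &act, &old) != 0)
        return -1;
    raise(signum);
    sigaction(signum, &old, NULL);
    return blocked_in_handler;
}
```

Calling signal_blocked_during_handler(SIGUSR1, 0) reports the signal as blocked inside the handler, while passing SA_NODEFER reports it as not blocked, so a second Ctrl+C could re-enter the handler.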

But I don't see why you delay the call to do_cleanup(). The deadlock that
I pointed out was only caused by code that was never reached, which isn't
the case anymore. Now I only see the segfault I explained in my previous
mail (i.e. one connection but two concurrent calls to mysql_query()).

Small things:
- a typo in tracker_log()?
  "so now shuting" => "so now shutting"?
- why is tracker_db_clear_temp() called before any mutex locking? IMHO
it should be called after mutexes for pooling or I don't understand its
goal!  :-)
- I think that a call to tracker_db_thread_end() is missing near the end
of do_cleanup().

Then, I wonder if it would be a good idea to make do_cleanup() not exit
the program but only quit the main GLib loop. This way no code would be
needed after g_main_loop_run() in main() to prevent some issues, and I
prefer to see a program which exits in its main function rather than
somewhere after an exit(). I did something like that in the piece
attached to my previous mail.


Laurent.


