Re: Locking again

On 2002.08.01 10:40:19 +0100 Pawel Salek wrote:
> Hi,
> I get lock-up with HEAD when the SSL certificate needs to be confirmed. Can 
> anybody reproduce it? Can anybody confirm that the attached patch fixes the 
> problem?

around the 14th i discussed this with peter (you should have the messages
on balsa-maint), iirc the certs callback gets called with gdk locked and
unlocked from diferent code paths so our goes at fixing it broken in one
of the cases. i'm attaching my message regarding it
peter proposed reversing the lock order on all code which might work.
the most imediate fix would to make imap_open suck again or adding wrapers
for use in each code path - argh

Carlos Morgado - chbm(at)chbm(dot)nu - -- gpgkey: 0x1FC57F0A FP:0A27 35D3 C448 3641 0573 6876 2A37 4BB2 1FC5 7F0A
Software is like sex; it's better when it's free. - Linus Torvalds

On Sun, Jul 14, 2002 at 12:16:52PM -0400, Peter Bloomfield wrote:
> On 2002.07.11 05:13 Carlos Morgado wrote:
> > 
> > hi, your fix to my deadlock seems to be working fine, can I commit it 
> > ?
> Carlos:
> I just updated my cvs HEAD branch: the fix seems to have metamorphosed 
> between my sending it and it being committed:
> $ cvs -z3 diff libbalsa/libbalsa.c
> Index: libbalsa/libbalsa.c
> ===================================================================
> RCS file: /cvs/gnome/balsa/libbalsa/libbalsa.c,v
> retrieving revision 1.47
> diff -u -r1.47 libbalsa.c
> --- libbalsa/libbalsa.c	11 Jul 2002 16:01:47 -0000	1.47
> +++ libbalsa/libbalsa.c	14 Jul 2002 15:04:09 -0000
> @@ -582,13 +582,13 @@
>       static pthread_mutex_t ask_cert_lock = PTHREAD_MUTEX_INITIALIZER;
>       AskCertData acd;
> -    libbalsa_unlock_mutt(); gdk_threads_leave();
>       pthread_mutex_lock(&ask_cert_lock);
> -    gdk_threads_enter(); libbalsa_lock_mutt();
>       pthread_cond_init(&acd.cond, NULL);
>       acd.cert = cert;
>       gtk_idle_add(ask_cert_idle, &acd);
> +    libbalsa_unlock_mutt(); gdk_threads_leave();
>       pthread_cond_wait(&acd.cond, &ask_cert_lock);
> +    gdk_threads_enter(); libbalsa_lock_mutt();
>            pthread_cond_destroy(&acd.cond);
>       pthread_mutex_unlock(&ask_cert_lock);
> In my setup, it's the cond_wait that needs to be executed without the 
> mutt and gdk locks, not the mutex_lock. However, I can see that if you 
> have multiple IMAP accounts, the lock might need the same treatment.

Bloody hell. I aplied the patch by hand, fudged it and it still worked for
me. uncany!
fortunatly it breaks with my home setup :)

> On reflection, I have serious doubts about this fix. Giving up the mutt 
> lock in the middle of making an IMAP connection seems like an 
> invitation to disaster: who knows what might have changed in mutt's 
> innards by the time the dialog has run, and the lock is reacquired?
you're right, as you might imagine i didn't even read the thing properly. 

with the original version i get
* 3 Thread 1026 (LWP 5549)  0x4011fba5 in __sigsuspend (set=0x4186bb4c)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
  2 Thread 2049 (LWP 5548)  0x401d73e7 in __poll (fds=0x8154474, nfds=1, 
    timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:63
  1 Thread 1024 (LWP 5543)  0x4011fba5 in __sigsuspend (set=0xbffff460)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:45

thread 3 comes from 
#3  0x080b243f in libmutt_ask_for_cert_acceptance (cert=0x81de0e0)
    at libbalsa.c:591
591	    pthread_cond_wait(&acd.cond, &ask_cert_lock);
#10 0x080ae366 in libbalsa_scanner_imap_dir (rnode=0x815841c, 
    server=0x815c3b0, path=0x815c408 "Mail/", subscribed=0, list_inbox=1, 
    depth=2, folder_handler=0x8082fbc <add_imap_folder>, 
    mailbox_handler=0x8082dec <add_imap_mailbox>) at folder-scanners.c:223

thread 1 comes from gtk_main  

problem is this
/* libmutt_ask_for_cert_acceptance:
   executed with GDK UNLOCKED. see mailbox_imap_open() and
is wrong, libbalsa_scanner_imap_dir locks threads and ask_for_cert
gets called from that code path. if you don't unlock gdk the
idle handler doesn't run and we deadlock. so we must drop gdk lock around
pthread_cond_wait. but then we may deadlock on libmutt lock. if we
drop the libmutt lock hell may break loose cause we're actually inside

we could just run ask_cert_real from libmutt_ask_cert but that
breaks cause we're not sure we have gdk_lock in libmutt_ask_cert and
we can't aquire the lock as it's a callback from openssl. we could add another
level of indirection ... argh

did i miss something or got this all wrong ? 

> I believe the issue is in the design of Balsa's locks. The convention 
> is to hold the gdk lock when asking for the mutt lock, and it was the 
> imap-scanning code's failure to honor this convention that I was 
> originally fixing. However, on the face of it, it would make more sense 
> to give up the gdk lock, which largely protects UI objects, when asking 
> for the mutt lock, which protects back-end objects. In general, it 
> seems to me to be good to hold as few locks as necessary; the question 
> is, how few really are necessary?

we don't *need* to hold the gdk lock when we're doing backend stuff. we
need to hold the gdk lock when we're doing UI stuff in a thread.
that's why we drop the gdk_lock around imap_open and got into this mess to
start with :)

> As time permits, I might take a look at reversing that convention: 
> having all code give up the gdk lock before grabbing the mutt lock. The 
> most obvious issue is making sure all code honors the new convention; 
> the more critical issue is making the code robust to changes in gdk/gtk 
> objects made while the gdk lock is dropped.

ok, that might work.

now, about balsa2 and GtkTreeStore sucking, i talked to one of the
guys working on gtk+2.0 and he confirmed TreeStore sucks
in the way you can't move stuff around. the 2 solutions he presented would
be to import TreeStore into balsa2 code and tinker with it's internals so 
that we can move stuff around. other way would be to ask kris ( @ximian 
i think) for the stuff he's working on for gtk+2.2 which is suposed to
have this stuff.


Carlos Morgado - chbm(at)chbm(dot)nu - -- gpgkey: 0x1FC57F0A FP:0A27 35D3 C448 3641 0573 6876 2A37 4BB2 1FC5 7F0A
Q: Why do computers have a reset button ?
A: Windows

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]