Re: [gtk-vnc-devel] Segfault



On Wed, Oct 03, 2007 at 03:49:26PM -0300, Jonh Wendell wrote:
> Em Qua, 2007-10-03 às 19:29 +0100, Daniel P. Berrange escreveu:
> > On Wed, Oct 03, 2007 at 03:00:23PM -0300, Jonh Wendell wrote:
> > > Hi, folks.
> > > 
> > > I'm getting a segmentation fault in vinagre, related to gtk-vnc:
> > > 
> > > -I connect into 200.100.100.246, type my password, the connection is OK;
> > > -I connect into 200.100.100.191, type my password, and vinagre crashes.
> > 
> > So, you're using 2 separate connections at the same time?
> 
> Yep, vinagre has support to multiple simultaneous connections, via tabs.

So as I mentioned on IRC this turned out to be the result of several factors
conspiring. The ultimate result though is that the 2 coroutines for each
active VNC connection end up yielding to each other forever, never going
back to the main system coroutine. Eventually one of them yields with 
control going back into the GLib main loop with invalid context. SEGV
follows shortly.

It is a tricky situation to understand, so I'll walk through the steps

 1. Start out in system coroutine, S
 2. Connect to first server, transitioning to coroutine A. A's caller is
    set to S
 3. A blocks on I/O, so it yields back to its caller S. S's caller is now
    set to A.
 4. When I/O is ready, S yield explicitly to A. 

Steps 3 & 4 continue ad-inifinitum as data comes in from the server. Notice,
however, step in step 3, we unneccessarily set S's caller to A. This is not
neccessary, because the GLib I/O callback has a direct reference to A, and
will explicitly yield back to A, so caller of S is never used.

Now, consider that we open a second connection..

 1. Start out in system coroutine, S
 2. Connect to second server, transitioning to coroutine B. B's caller is
    set to S
 3. B blocks on I/O, so it yields back to its caller S. S's caller is now
    set to B.
 4. When I/O is ready, S yield explicitly to B.
     ...repeat...
 5. B gets to point where it needs to get a password to authenticate, so 
    we fire auth callback to the application.
 6. Vinagre sets up its auth dialog box and runs gtk_dialog_run()
 7. gtk_dialog_run() spawns a nested main loop
 8. I/O is now ready for coroutine A again, B yields to A, setting A's
    caller to be B.
 9. A blocks on I/O again, so yields back to its caller B. B's caller
    is now set to A.

At this point A's caller is B, and B's caller is A. There is now no way we
can ever yield back to the system co-routine S.

It bounces back & forth but soon goes kaboom.


In considering this I observe:

 - The yield() call swaps to the caller.
 - The yieldto() call swpas to the specified coroutine.
 - The system co-routine must never call 'yield()', since it does not
   have anyone to yield to.

Following on from this:

 - The yield() call must use the caller, and then set it to NULL
 - The yieldto() call must set the caller and not use it.
 - It is incorect to call yieldto() if the destination has a caller
   already set - this indicates re-entrancy
 - It is incorrect to call yield() if you don't have a caller set

So the attach patch makes sure yeild() sets the caller to NULL. It also calls
an immediate abort() if either of the last two conditions occur. This ensures
we get a stack trace at the point the error actually occurs, rather than a
SEGV some random time later.

NB I had to rename system to leader, because when I include stdlib.h the
system variable clashes with the system() fnuction.

With this patch applied I can connect to multiple servers, and even re-enter
the mainloop from within a co-routine without messing up the coroutine caller
chain.

Dan
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 
diff -r 1d5e4d1f22a6 src/coroutine.c
--- a/src/coroutine.c	Wed Oct 03 12:39:50 2007 -0300
+++ b/src/coroutine.c	Wed Oct 03 16:52:31 2007 -0400
@@ -11,6 +11,7 @@
 #include <sys/types.h>
 #include <sys/mman.h>
 #include <stdio.h>
+#include <stdlib.h>
 #include "coroutine.h"
 
 int coroutine_release(struct coroutine *co)
@@ -57,25 +58,23 @@ int coroutine_init(struct coroutine *co)
 }
 
 #if 0
-static __thread struct coroutine system;
+static __thread struct coroutine leader;
 static __thread struct coroutine *current;
 #else
-static struct coroutine system;
+static struct coroutine leader;
 static struct coroutine *current;
 #endif
 
 struct coroutine *coroutine_self(void)
 {
 	if (current == NULL)
-		current = &system;
+		current = &leader;
 	return current;
 }
 
 void *coroutine_swap(struct coroutine *from, struct coroutine *to, void *arg)
 {
 	int ret;
-
-	to->caller = from;
 	to->data = arg;
 	current = to;
 	ret = cc_swap(&from->cc, &to->cc);
@@ -83,7 +82,7 @@ void *coroutine_swap(struct coroutine *f
 		return from->data;
 	else if (ret == 1) {
 		coroutine_release(to);
-		current = &system;
+		current = &leader;
 		to->exited = 1;
 		return to->data;
 	}
@@ -93,12 +92,23 @@ void *coroutine_swap(struct coroutine *f
 
 void *yieldto(struct coroutine *to, void *arg)
 {
+	if (to->caller) {
+		fprintf(stderr, "Co-routine is re-entering itself\n");
+		abort();
+	}
+	to->caller = coroutine_self();
 	return coroutine_swap(coroutine_self(), to, arg);
 }
 
 void *yield(void *arg)
 {
-	return yieldto(coroutine_self()->caller, arg);
+	struct coroutine *to = coroutine_self()->caller;
+	if (!to) {
+		fprintf(stderr, "Co-routine is yielding to no one\n");
+		abort();
+	}
+	coroutine_self()->caller = NULL;
+	return coroutine_swap(coroutine_self(), to, arg);
 }
 /*
  * Local variables:


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]