[gdm-list] (PATCH) Re: gdm hung after killing child



Hello,

In the spirit of solving my own problems...

I have found that this is because when a gdm slave is killed, it shuts down the session and exits with DISPLAY_ABORT, which means the master drops that display.

The attached patch makes it exit with DISPLAY_REMANAGE for displays of type TYPE_LOCAL. Incidentally, this replicates xdm's behaviour when its slave is killed.
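
To illustrate the difference between the two exit codes, here is a rough sketch of the kind of dispatch a master does when a slave exits. It is not GDM's code: the function name, the action enum, and the numeric values are all made up for the example, and treating every other exit code as "drop the display" is just a simplification.

```c
/* Placeholder exit codes for the sketch; the real values live in GDM's
 * headers and may differ. */
enum { SKETCH_DISPLAY_REMANAGE = 2, SKETCH_DISPLAY_ABORT = 4 };

/* What the master does with a display once its slave has exited. */
enum display_action { ACTION_RESTART_DISPLAY, ACTION_DROP_DISPLAY };

/* Map a slave's exit code to an action (hypothetical helper).
 * REMANAGE: start a fresh slave and X server for the display.
 * ABORT:    forget the display, so nothing is ever restarted. */
enum display_action action_for_slave_exit(int exit_code)
{
    switch (exit_code) {
    case SKETCH_DISPLAY_REMANAGE:
        return ACTION_RESTART_DISPLAY;
    case SKETCH_DISPLAY_ABORT:
    default:                      /* sketch-only choice for unknown codes */
        return ACTION_DROP_DISPLAY;
    }
}
```

With the unpatched handler a killed local slave always lands in the ABORT branch, which matches the master sitting idle in poll() afterwards.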

If this is not acceptable (and if so, why?), I have an alternative patch that handles the HUP signal and exits with DISPLAY_REMANAGE when it receives one.

On a side note, the slave's global "d" variable is initialised after the signal handlers are installed, so there is a race: if a signal arrives before gdm_slave_run has set it, the handler uses d (for example d->name in the debug call) before it is valid. It's very unlikely to happen, but why not just set d = display (or d = 0) earlier?
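
As a sketch of the ordering I mean (not the real slave code: the struct, constants, and function names here are invented for illustration), assign the global first, install the handler second, and have the handler cope with a null pointer:

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for the display types and slave exit codes. */
enum { SKETCH_TYPE_LOCAL = 1, SKETCH_TYPE_REMOTE = 2 };
enum { HANDLER_EXIT_REMANAGE = 2, HANDLER_EXIT_ABORT = 4 };

struct sketch_display { int type; };

/* The global the handler reads; NULL until slave_setup() assigns it. */
static struct sketch_display *volatile d = NULL;
static volatile sig_atomic_t exit_code_to_use = 0;

/* Only records which exit code to use; checks d before touching it. */
static void term_handler(int sig)
{
    struct sketch_display *cur = d;

    (void)sig;
    if (cur != NULL && cur->type == SKETCH_TYPE_LOCAL)
        exit_code_to_use = HANDLER_EXIT_REMANAGE;
    else
        exit_code_to_use = HANDLER_EXIT_ABORT;
}

/* Assign the global first, then install the handler.
 * Returns 0 on success, -1 if the handler could not be installed. */
int slave_setup(struct sketch_display *display)
{
    d = display;    /* initialised before any handler can run */
    return signal(SIGTERM, term_handler) == SIG_ERR ? -1 : 0;
}

/* Accessor so callers can see what the handler chose. */
int sketch_exit_code(void)
{
    return exit_code_to_use;
}
```

Because the assignment comes before signal(), the handler can never run before d has been set, and the null check covers the case where no display was passed at all.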

Cheers,

 - Simon

On Tue, 20 Sep 2005, Simon Bowden wrote:

Hello,

Running a Debian GDM: 2.6.0.8.

When GDM starts, it launches a child process which in turn launches the X server, resulting in a structure like this:

root     18325     1  0 16:57 ?        00:00:00 gdm
root     18326 18325  0 16:57 ?        00:00:00 gdm
root     18586 18326  0 17:08 ?        00:00:03 /usr/X11R6/bin/X :0 -audit 0 -auth /var/lib/gdm/:0.Xauth

Now, for historical reasons, we have something that sends a SIGTERM to the parent of the X server (18326 above). That process dies and takes the X server with it (which is intended), BUT then nothing starts up again.

The 18325 process just sits there indefinitely. No X server is running now. If I SIGHUP the process then it wakes up, but that's manual intervention.

# strace -f -p 18325
Process 18325 attached - interrupt to quit
poll(
[never returns in many minutes waited]

One could argue that we should just kill its parent instead, but I really don't think gdm should react like this anyway.

Some further debug info:
It logs this:
Sep 20 18:19:53 gdm[18325]: gdm_child_action: Aborting display :0

The strace does this just before the poll:
send(7, "<30>Sep 20 18:19:53 gdm[18325]: gdm_child_action: Aborting display :0", 69, 0) = 69
rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
time(NULL)                              = 1127204393
close(8)                                = 0
gettimeofday({1127204393, 22553}, {4294966696, 0}) = 0
waitpid(-1, 0xbfb601f8, WNOHANG)        = -1 ECHILD (No child processes)
poll( <unfinished ...>


Cheers,

- Simon
--- daemon/slave.c.orig	2005-10-25 13:55:56.000000000 +1000
+++ daemon/slave.c	2005-10-25 13:49:55.000000000 +1000
@@ -4359,7 +4359,11 @@
 
 	gdm_debug ("gdm_slave_term_handler: %s got TERM/INT signal", d->name);
 
-	exit_code_to_use = DISPLAY_ABORT;
+        if (d && d->type == TYPE_LOCAL)
+            exit_code_to_use = DISPLAY_REMANAGE;
+        else
+            exit_code_to_use = DISPLAY_ABORT;
+
 	need_to_quit_after_session_stop = TRUE;
 
 	if (already_in_slave_start_jmp ||

