Re: PostSession script support fades out from 2.4 to 2.6?



On Tue, Aug 09, 2005 at 03:29:45PM -0500, Brian Cameron wrote:
> 
> Jerry:
> 
> It is a known bug that the PostSession script doesn't get run properly
> on exit in some situations.
> 
> Refer to bugzilla bug 152906:
> 
>   http://bugzilla.gnome.org/show_bug.cgi?id=152906
> 
> I believe in the situation you describe, the gdm_slave_xioerror_handler
> will get called to process the signal and this should notice that the
> session was started and call term_session_stop_and_quit, which will call
> gdm_slave_quick_exit.
> 
> I suspect this might not be working due to the setjmp/longjmp logic
> because calling longjmp will return the state of the program to when
> setjmp was called, so the state of the global variables may get lost
> causing gdm to "forget" it has a running session and causing the
> PostSession to not get called.  You can refer to the bug report
> mentioned above for more information.

longjmp doesn't change the heap or static data (and thus not global
variables), only the stack.
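
To make that concrete, here is a minimal sketch, not gdm code.  The names
jump_point and global_after_longjmp are made up, and session_started only
stands in for the real global.  What can be lost across a longjmp are
non-volatile automatic (local) variables that were changed between the setjmp
and the longjmp; objects with static storage keep whatever was last written.

```c
#include <setjmp.h>

static jmp_buf jump_point;       /* hypothetical name */
static int session_started = 0;  /* stand-in for a global in the slave */

/* Returns the value of the global as observed after the longjmp. */
int global_after_longjmp(void)
{
	session_started = 0;

	if (setjmp(jump_point) == 0) {
		/* direct return from setjmp: change the global, then jump */
		session_started = 1;
		longjmp(jump_point, 1);
	}

	/* reached only through longjmp; the write above is still visible */
	return session_started;
}
```

Calling global_after_longjmp() returns 1, so a jump by itself does not make
the program "forget" a global.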

xioerror and signals are a problem since longjmp is the only way to do work
outside the context of a signal handler.  You MUST longjmp if you want to
call certain system calls to avoid hangs / memory corruption etc...  Of
course longjmp brings its own problems.  You simply cannot call session_stop
from xioerror_handler since that might be inside a signal handler.  Xlib
sucks this way.
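
As a rough sketch of the pattern (again not the actual slave code): the
handler does nothing except jump, and the real cleanup runs after the jump, in
normal context.  Everything named here (stop_jmp, can_jump, on_signal,
session_cleanup, run_until_signal) is hypothetical, and SIGUSR1 plus raise()
merely simulate a signal arriving.  It assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf stop_jmp;                 /* hypothetical name */
static volatile sig_atomic_t can_jump = 0;  /* is stop_jmp valid yet? */
static int cleanup_runs = 0;

/* The handler does no real work, it only jumps out. */
static void on_signal(int sig)
{
	(void) sig;
	if (can_jump) {
		can_jump = 0;
		siglongjmp(stop_jmp, 1);
	}
}

/* Stand-in for work that is unsafe inside a handler. */
static void session_cleanup(void)
{
	cleanup_runs++;
}

/* Returns how many times cleanup ran, or -1 on failure. */
int run_until_signal(void)
{
	struct sigaction sa;

	sa.sa_handler = on_signal;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		return -1;

	cleanup_runs = 0;
	/* second argument 1: the signal mask is restored on the jump */
	if (sigsetjmp(stop_jmp, 1) == 0) {
		can_jump = 1;
		raise(SIGUSR1);   /* simulate the signal arriving */
		return -1;        /* not reached if the handler jumped */
	}

	/* back in normal context, outside the handler */
	session_cleanup();
	return cleanup_runs;
}
```

The can_jump guard is there so the handler never jumps to a buffer that has
not been set up yet.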

It must be a problem of logic, not with the longjmp.  It would be interesting
to find out what the state of the globals is when stop_session is called.  In
particular, when the setjmp returns JMP_SESSION_STOP_AND_QUIT, what is the
state of the variables mentioned in

	/* only if we're not hanging in session stop and getting a
	   TERM signal again */
	if (in_session_stop == 0 && session_started)
		gdm_slave_session_stop (d->logged_in && login != NULL,
					TRUE /* no_shutdown_check */);

Is slave_session_stop called at all?  If so, are 'd->logged_in' and 'login' in
the wrong state?

It could be that there is some sort of race happening with signal handlers.
These are very hard to catch.

> I would add some gdm_debug() calls to the code and verify that this
> is the problem.  If so, we could rip out the setjmp/longjmp code
> and fix the code so it does the same thing without using jumping.
> I suspect that this will fix the problem.  Could you help with
> this?

You cannot do most things out of signal handlers or out of the xioerror
handler without getting undefined behaviour.  You NEED longjmp to deal with
xioerror because of the way it works.  If you rip out the longjmp stuff you
will bring back even worse problems that happen on things like ctrl-alt-bs.

The correct fix would be to rewrite the slave to be totally event oriented
with a proper mainloop.  This would NOT be trivial and would need to either
introduce threads or rewrite the synchronous parts of Xlib inside gdm.  Such
a rewrite would (if done properly which is not an easy task) solve all the
race issues with the signal handlers.  The signal handlers would just trip
over a global variable and break the mainloop like they do in the master
daemon.  You would still however need longjmp for xioerror, unless you would
not use Xlib at all.
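
Roughly what I mean by tripping a global, as a sketch only: the names
got_signal, flag_handler and mainloop_iterations are made up, this is not the
master daemon's code, and raise() again just simulates the signal.

```c
#include <signal.h>

/* the only thing the handler is allowed to touch */
static volatile sig_atomic_t got_signal = 0;

static void flag_handler(int sig)
{
	(void) sig;
	got_signal = 1;
}

/*
 * Runs a toy mainloop and returns the number of iterations done before
 * the flag broke it.  A signal is simulated on iteration signal_at;
 * max_iter bounds the loop so the sketch always terminates.
 */
int mainloop_iterations(int signal_at, int max_iter)
{
	int i = 0;

	got_signal = 0;
	if (signal(SIGUSR1, flag_handler) == SIG_ERR)
		return -1;

	while (!got_signal && i < max_iter) {
		i++;                     /* one unit of "event" work */
		if (i == signal_at)
			raise(SIGUSR1);  /* simulate the signal */
	}

	/* all real shutdown work would happen here, outside the handler */
	return i;
}
```

With this shape the handler never calls anything unsafe, which is why the race
issues go away once all the work is driven from the loop.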

George

-- 
George <jirka 5z com>
   Originality is undetected plagiarism.
                       -- Dean W. R. Inge


