Re: GChildWatchSource take three



On Fri, 2003-08-01 at 16:53, Alexis S. L. Carvalho wrote:
> Thus spake Owen Taylor:
> > +void
> > +g_child_watch_source_init (void)
> > ===
> > 
> > I don't think making this public is a good idea; the precondition
> > for using this code is that you aren't using SIGCHLD yourself.
> > If you are, you can remove your handler before calling
> > g_child_watch_source_new(), we don't have to clutter the API
> > for that.
> 
> The thing is:  usually, when you launch a child process, you do
> something like:
> 
> 1) setup signal handler and unblock SIGCHLD
> 2) launch child
> 3) SIGCHLD arrived; call waitpid.
> 
> Without this function being public, you can only set the signal handler
> after launching the child.
> 
> Is this a problem?  Well, if the programmer himself had a signal handler
> installed, it's his problem - let him shoot his foot.
> 
> I'm more worried about what the parent of the process that uses
> g_child_watch can do - this is also the source of my comments about
> signal masks.
> 
> Digging around some more, I found a case where you can make waitpid fail
> with ECHILD while using g_child_watch - sure, this case depends on
> behaviour specified by SuSv3 (but not Posix) regarding setting SIGCHLD

[ Irrelevant note - SuSv3 *is* the current version of POSIX ]

> to SIG_IGN (no zombies are generated), and some unespecified behaviour
> during exec (whether setting SIGCHLD to SIG_IGN survives an exec).
> Nevertheless, it occurs on linux.
> 
> Essentially:  set SIGCHLD to SIG_IGN and launch a program that uses
> g_child_watch.  If the child exits before the (first) call to
> g_child_watch_add, the source is dispatched with a status of -1.
> 
> Proof of concept:  modifying the non-threaded child-test.c like this
> (give the race condition some chance to happen):
> 
>    pid = get_a_child ();
> +  sleep (3);
>    g_child_watch_add (pid, child_watch_callback, NULL);
> 
> and compiling and running the attached program, I get this:
> 
> $ ./child-test 
> Testing non-threaded GChildWatchSource
> child 12434 exited, status 0
> 
> $ ./sigchld_ign child-test
> Testing non-threaded GChildWatchSource
> child 12466 exited, status -1
> 
> Instrumenting check_for_child_exited to print the value of errno, I get
> "No child processes" (i.e. ECHILD).
> 
> Calling g_child_watch_source_init before get_a_child fixes this.
> 
> If you decide to keep this function public, the docs should mention that
> it should be called before the first call to fork (i.e. before the first
> child is launched) - otherwise there's no point in calling it.

Hmm, if we think this is a problem, the API should not call the function
implicitely, but *require* you to call it explicitely. Otherwise, you
are inviting people to write buggy programs that usually work.

But is this a real problem? How many programs are there out there that
run a child process synchronously by forking a child and then 
immediately calling waitpid() on it? How many of them worry that 
SIGCHLD might be ignored?

I think a parent program that leaves SIGCHLD as SIGIGN when forking
a child is just buggy. Even glibc's system() implementation looks 
vulnerable to this.

[ I wrote a short program to test system(), and it seemed to work 
fine, but that may just be a race condition thing. ]

Regards,
						Owen





[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]