Re: [Evolution-hackers] RH 7.3 filter / threading bug

	And of course - since we closed socket 27 the worker thread never gets
woken up to do its stuff / send us a signal to wake us up :-) so we hang.

	So - the obvious fix (which we tested and works fine) is to do:

@@ -695,7 +695,7 @@
 		if (maxfd > 0) {
 			for (fd = 0; fd < maxfd; fd++) {
 				if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd !=
-					close (fd);
+					fcntl (fd, F_SETFD, FD_CLOEXEC);
	Which makes life good. My feeling is this bug should persist on systems
without the new futex stuff (which bins the worker thread), i.e. RH 8 too
(but perhaps by some fluke we escape it).
Cool.  I didn't think the thread stuff used sockets, but I never looked into it too closely.  That code looked like 'standard' fork/exec code to me, but maybe it's a hangover from much older unix days (i.e. sunos4).

	I believe - that it's a trendy thing to bash glib, so I had a dig in
there, and would you believe it g_spawn does:

      open_max = sysconf (_SC_OPEN_MAX);
      for (i = 3; i < open_max; i++)
        set_cloexec (i);

	ie. when we switch to g_spawn there won't be a problem ;-)

The APIs look a bit weird though.  So nothing new for glib there.

	A quick grep for _SC_OPEN_MAX reveals that the same problem code is cut
and pasted [ URGH - why ? ] in HEAD and stable in a frightening number
of places; at first glance every one of them does it wrong:

I really hate pointing the finger like this, but, blame Jeff on this one I think :)

	It would be good to fix those.

I want(ed) to get everything going through a common abstraction but never got around to it.  The cut and paste is just an example of poor code re-use (and not exclusive to this function :().

	In addition, examining the code [ and being no expert on how the camel
stream stuff works ] it seems that in camel-filter-driver.c
(pipe_to_system) we do effectively:

	... fork-helper ...
	... synchronous blocking write of whole file ...
	... flush ...
	... synchronous blocking read of reply ...

	This approach is (naturally) doomed to failure for any filter that
returns more than a socket-buffer's worth of data while also being fed
more than a socket-buffer's worth of input - since the child will block
writing its output (which we aren't reading yet) while we block writing
input to it: is that a bug ? [ I appreciate perhaps for several cases
this is ok - since we're just looking for a status report or single
line answer but ... ].
Yes, I know this - I brought this up with Jeff earlier once I started looking into the code.  I think the only reason this actually works is that all filters anybody has used so far read the entire message before outputting anything.

We used to have another option which didn't read any output too (>/dev/null), but for some reason that got removed from the filter system; I'm not sure why Jeff did this since the behaviours are distinct.

	I ask, since Mark's test case was to use 'cat' as the filter script -
and you can envisage cases that produce lots of random output I suppose.
Has that aspect also been cut / pasted ?

	Anyhow, it was an 'interesting' bug to have to untangle (with Dan's
help) :-) the thread helper thing really sucks; and I guess none of us
have RH 7.3 systems handy to test on these days.

But this would mean that they were always having this issue, not just with > 1.4.5 as they claimed :)

