Re: [Evolution-hackers] Camel Manifesto



Hi Matthew,

On Fri, 2009-11-20 at 10:46 -0500, Matthew Barnes wrote:
> There may be isolated cases internally to Camel where it can exploit
> parallelism in CPU-intensive tasks with threading or where threads are
> necessary for interacting with synchronous-only libraries, but it should
> be used sparingly and hidden behind a fully asynchronous API.

	So - I'm well up for hiding complexity behind an asynchronous API in
general; that's a great goal. I guess there is also the mail-to-e-d-s
red herring to consider in the mix - that (potentially) adds a layer of
asynchronicity to the equation in the form of remote dbus calls; perhaps
worth considering that in parallel - though it would be cut at a
different place (potentially).

>   It should not be central to the design of the entire mail application,
> as it is currently.  Basically I want the mail front-end in Evolution 3
> to be single threaded, or as close to that as possible.

	Sounds reasonable.

> The first is what I think is a very insightful paper on the inherent
> problems with threads:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

	Sure - I read it carefully many moons ago; and it's good - though
IMHO it over-states the case somewhat, or at least - it seems to me that
sometimes the alternative is worse.

	The problem IMHO comes when there is a multi-step process; eg.

	DNS lookup
	socket connect
	N way ssh handshake 

	none of which the application cares about - really; the 'async' way is
to whack this all as atomised pieces of code, into some state machine:

	switch (state) {
	case DNS_CONNECTING:
		if (!failed) {
			state = DNS_CONNECTED;
			pollstate = IO_OUT|ERR;
		}
		break;
	case DNS_CONNECTED:
		send (lookup_msg);
		pollstate = IO_IN|ERR;
		state = DNS_WAIT_REPLY;
		break;
	case DNS_WAIT_REPLY:
		read () ...
		if (!short_read)
			state = SOCKET_NEW;
		/* else: stay in this state until the rest arrives */
		break;
	case SOCKET_NEW:
		...
		break;
	case SOCKET_CONNECTED:
		pollstate = IO_OUT|ERR;
		break;
	case SOCKET_CAN_WRITE:
		...
		pollstate = IO_IN|ERR;
		break;
	case SOCKET_WAIT_RESULT:
		...
		break;
	}

	etc. etc. etc. This is basically what ORBit2 / linc does - although, of
course we get lazy when eg. Windows demands more round-trips, and we
probably do DNS synchronously (that stuff never worked well anyway), and
so on. It is not particularly awful - though, some tricks such as
checking for 'IO_IN' before HUP etc. to avoid losing the end of a
message are worth not forgetting ;-)
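
	A rough sketch of that ordering trick, assuming a plain poll() loop
where POLLIN and POLLHUP stand in for 'IO_IN' and HUP. The enum and the
function names are made up for illustration; this is not how linc does
it, only the general shape:

```c
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical classification of one poll() result. */
typedef enum { CONN_DATA, CONN_CLOSED, CONN_ERROR, CONN_IDLE } ConnEvent;

static ConnEvent classify_revents (short revents)
{
	/* Readable data wins: a peer can send its last bytes and then
	 * close, in which case POLLIN and POLLHUP arrive together. */
	if (revents & POLLIN)
		return CONN_DATA;
	if (revents & (POLLERR | POLLNVAL))
		return CONN_ERROR;
	if (revents & POLLHUP)
		return CONN_CLOSED;
	return CONN_IDLE;
}

/* Read pending bytes if there are any; returns 0 when there is nothing
 * to read, otherwise whatever read() returns. */
static ssize_t drain_if_readable (int fd, short revents, char *buf, size_t len)
{
	assert (buf != NULL && len > 0);
	if (classify_revents (revents) != CONN_DATA)
		return 0;
	return read (fd, buf, len);
}
```

	The caller keeps polling and draining until classify_revents() stops
reporting CONN_DATA, and only then treats CONN_CLOSED as the end of the
connection.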

	Of course - as the number of the steps in the handshake grows the scope
for error and confusion grows - never mind the debugging problem: when it
locks up, what went wrong ? :-) how do you even see the state of the
umpteen state machines that are ticking away behind the scenes ?

	Of course - some large 'state' structure is required - replicating an
equivalent thread's stack (but on the heap), and that has to be
lifecycle managed and so on in a similar way to threads I guess - with
some extra function overhead.
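
	To make the 'stack on the heap' point concrete, here is a minimal
sketch of such a state structure. All the names (ConnOp, conn_op_step
and friends) are hypothetical, and the real I/O is replaced by an 'ok'
flag passed in by whoever runs the main loop:

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ST_LOOKUP, ST_CONNECT, ST_HANDSHAKE, ST_DONE, ST_FAILED } ConnState;

/* Everything a thread would keep in local variables lives here instead,
 * and has to be allocated, tracked and freed by hand. */
typedef struct {
	ConnState state;
	int       fd;               /* socket, once there is one */
	int       handshake_step;   /* round-trips completed so far */
	int       handshake_total;  /* round-trips the handshake needs */
} ConnOp;

static ConnOp *conn_op_new (int handshake_total)
{
	ConnOp *op = calloc (1, sizeof *op);
	if (op) {
		op->state = ST_LOOKUP;
		op->fd = -1;
		op->handshake_total = handshake_total;
	}
	return op;
}

/* Called each time the awaited I/O event fires; 'ok' says whether the
 * step that was pending succeeded. */
static void conn_op_step (ConnOp *op, int ok)
{
	assert (op != NULL);
	if (!ok) {
		op->state = ST_FAILED;
		return;
	}
	switch (op->state) {
	case ST_LOOKUP:
		op->state = ST_CONNECT;
		break;
	case ST_CONNECT:
		op->state = op->handshake_total > 0 ? ST_HANDSHAKE : ST_DONE;
		break;
	case ST_HANDSHAKE:
		if (++op->handshake_step == op->handshake_total)
			op->state = ST_DONE;
		break;
	case ST_DONE:
	case ST_FAILED:
		break;
	}
}

/* Without something like this, a debugger shows only an integer. */
static const char *conn_state_name (ConnState s)
{
	static const char *names[] = { "lookup", "connect", "handshake", "done", "failed" };
	return names[s];
}
```

	Each extra handshake round-trip adds a field and a case here, where
the threaded version would just add another blocking call.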

	The threaded version with async callback API I guess has the same
initial closure creation overhead; but then the code is fairly easy to
read:

	host_addr = do_blocking_lookup (name);
	if (cancelled || !host_addr)
		goto emit_error;
	fd = socket ();
	connect_error = connect (fd, host_addr);
	if (cancelled || connect_error)
		goto emit_error;
	write_request (fd);
	read_reply (fd);
	emit_success_callback () etc.

	it is also rather easy to debug - as soon as anything fails - with
'bug-buddy' or other conventional debugging tools it is easy to see who
was causing the blocking / dead-locking, or what synchronous calls were
not responsive.

	It is also far easier and clearer to re-use code via blocking calls (I
suspect) - than to cobble other sets of states into your state machine.

	And of course, none of this is news to anyone I'm sure. Clearly though
- the simpler the locking, and the closer to clean & simple message
passing - the easier and safer the threading becomes.
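
	A minimal sketch of what 'blocking worker plus simple message passing'
could look like, assuming POSIX threads and a pipe that the main loop
polls. Job, job_start, job_finish and do_blocking_work are invented for
the example; do_blocking_work stands in for the lookup / connect /
handshake sequence. No locks are needed because the worker owns the Job
until it has written to the pipe:

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
	int input;      /* request, set before the thread starts */
	int result;     /* reply, written by the worker only */
	int notify_fd;  /* write end of the pipe, owned by the worker */
} Job;

/* Hypothetical stand-in for the real blocking sequence. */
static int do_blocking_work (int input)
{
	return input * 2;
}

static void *worker_main (void *data)
{
	Job *job = data;
	char done = 1;

	job->result = do_blocking_work (job->input);
	/* One byte down the pipe is the only cross-thread message. Closing
	 * the write end gives the reader EOF if the write failed. */
	if (write (job->notify_fd, &done, 1) != 1)
		job->result = -1;
	close (job->notify_fd);
	return NULL;
}

/* Starts the worker; returns the fd for the main loop to watch, or -1. */
static int job_start (Job *job, pthread_t *thread)
{
	int fds[2];

	assert (job != NULL && thread != NULL);
	if (pipe (fds) != 0)
		return -1;
	job->notify_fd = fds[1];
	if (pthread_create (thread, NULL, worker_main, job) != 0) {
		close (fds[0]);
		close (fds[1]);
		return -1;
	}
	return fds[0];
}

/* Call once read_fd is readable; it blocks only if called early. */
static int job_finish (Job *job, pthread_t thread, int read_fd)
{
	char done;
	ssize_t n = read (read_fd, &done, 1);

	pthread_join (thread, NULL);
	close (read_fd);
	return n == 1 ? job->result : -1;
}
```

	The front-end only ever sees a file descriptor becoming readable, so
the API stays asynchronous while the worker's code stays a straight
line of blocking calls.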

	Anyhow - to me at least ( thankfully shielded from the pain that is
suffered by users of camel ) it seems like a large influx of re-writing
everything as async is unlikely to give substantial reliability wins
(beyond those intrinsic to having a great hacker re-read, and test the
code).

	But - since I'm not doing it, I can only write long & silly mails to
try to persuade :-)

	ATB,

		Michael.

-- 
 michael meeks novell com  <><, Pseudo Engineer, itinerant idiot


