Hi John!

On Wed, 23-04-2008 at 21:23 +0100, John Carr wrote:
> Hi All
> Tonight I have good news and bad news...
> has been updated a little,
> included the setup I go through to get a working master and slave from
> scratch. Importantly, the version of the code this deploys needs only
> *one* open port on the master \o/

Good! :)

> My victory is short lived however. After talking to API, I started up
> more slaves and quickly fell victim to various errors from the proxy:

That feeling is familiar to me too... :)

> 2008/04/23 21:07 +0100 [jhbuildbot.master.SocksFactory] Could not
> accept new connection (EMFILE)
> And the web server was also not feeling well:
> 2008/04/23 21:08 +0100 [twisted.web.server.Site] Could not accept new
> connection (EMFILE)
> A quick grep | wc -l of a netstat revealed i have 826 connections in
> the master process. For 2 slaves. I think this is more than my test PC
> is happy with. Because i'm proxying I have twice the number of
> connections (slave to proxy, proxy to master) and there are > 160
> projects... So I think I hit a limit. This will occur with the
> multiplexer and the multi-port solution too, but they should be able
> to cope with twice as many nodes as me (which is only 4 so far).

Right, that is too many connections, and I think it will get worse in
the future, as the GNOME moduleset keeps growing and including more
modules.

And yes, this will affect the multi-port solution too at some point,
though it will have twice the room for connections that you have, since
it does not need proxy connections, as you have pointed out.
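
To put rough numbers on this, here is a small sketch of the arithmetic. It
assumes the limit being hit is the per-process limit on open file
descriptors (which is what EMFILE normally means) and that dropping the
proxy hop exactly halves the connections per slave. The helper name
max_slaves and the reserved value are made up for illustration.

```python
import resource


def max_slaves(fd_limit, connections_per_slave, reserved=32):
    """How many slaves fit under fd_limit, keeping `reserved` descriptors free.

    `reserved` is a guess at what the master needs for logs, listening
    sockets and so on; it is an assumption, not a measured value.
    """
    usable = fd_limit - reserved
    return max(0, int(usable // connections_per_slave))


# Figures from the netstat count quoted above: 826 connections for 2 slaves.
proxied_per_slave = 826 / 2               # slave->proxy plus proxy->master
direct_per_slave = proxied_per_slave / 2  # assumption: no proxy hop halves it

# Soft limit on open descriptors for the current process.
soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)

if soft_limit != resource.RLIM_INFINITY:
    print("soft limit:", soft_limit)
    print("slaves via proxy:", max_slaves(soft_limit, proxied_per_slave))
    print("slaves direct:", max_slaves(soft_limit, direct_per_slave))
```

If the soft limit were 1024, this would give 2 slaves through the proxy
and 4 without it, which is roughly in line with your estimate.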

> If this is true, I think the only way to scale to a decent number of
> nodes is to move entirely to a single connection. Unless buildbot
> upstream is working on stuff that might help us, the way forward might
> be to implement a custom ITransport so that the masters and slaves
> don't open TCP/IP connections directly and instead send their data to
> a shared connection. I have some thoughts on how this would work, and
> its only a slight step away from the current proxying system really.
> Slave side this is (relatively) easy, but i need to investigate more
> on the master side.

I see your point, and I think it is worth a try. It would also be a good
idea to ask the buildbot developers directly and see whether they are
working on something similar or have suggestions about the way to go.
I'll send an email to buildbot-devel and ask.
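
Just to make the single-connection idea concrete, here is a minimal
framing sketch: every chunk of data is prefixed with a channel id and a
length so that many logical streams can share one TCP connection. This is
not Twisted's or buildbot's API, and the names (encode_frame,
FrameDecoder) and the 8-byte header layout are invented for illustration;
a real ITransport implementation would sit on top of something like this.

```python
import struct

# Header: channel id and payload length, both unsigned 32-bit, network order.
HEADER = struct.Struct("!II")


def encode_frame(channel, payload):
    """Wrap `payload` (bytes) so the receiver knows which channel it is for."""
    return HEADER.pack(channel, len(payload)) + payload


class FrameDecoder:
    """Reassemble frames from a byte stream that may arrive in pieces."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Add received bytes; return a list of complete (channel, payload)."""
        self._buffer += data
        frames = []
        while len(self._buffer) >= HEADER.size:
            channel, length = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # payload not fully received yet
            frames.append((channel, self._buffer[HEADER.size:end]))
            self._buffer = self._buffer[end:]
        return frames


if __name__ == "__main__":
    stream = encode_frame(1, b"build gtk+") + encode_frame(2, b"build glib")
    decoder = FrameDecoder()
    # Feed the stream in two arbitrary pieces, as TCP might deliver it.
    print(decoder.feed(stream[:5]))
    print(decoder.feed(stream[5:]))
```

Each project would then get a channel id instead of its own socket, so the
number of descriptors on the master would depend on the number of slaves
and not on the number of projects.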

> NOTE that i'm making nasty assumptions about why its exploding. If
> someone with a working multi-port master and some spare pc's could
> quickly see how many slaves we can currently handle it would be a big
> help..

I think we can arrange something here to run those tests; I'll let you
know the results as soon as I have them. However, I expect to hit the
same problem for the same reason, since the number of connections will
reach a limit at some point.
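
In the meantime, the EMFILE error itself can be reproduced without any
slaves. The sketch below temporarily lowers the soft descriptor limit of
the current process and opens sockets until the OS refuses. It assumes a
Unix system (the resource module is not available elsewhere), and the
function name sockets_until_emfile is only illustrative.

```python
import errno
import resource
import socket


def sockets_until_emfile(soft_limit):
    """Open sockets under a lowered soft limit; return how many succeeded."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    opened = []
    try:
        while True:
            try:
                opened.append(socket.socket())
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                # Same condition the master logs as "(EMFILE)".
                return len(opened)
    finally:
        for sock in opened:
            sock.close()
        # Restore the original soft limit for the rest of the process.
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


if __name__ == "__main__":
    print("sockets opened before EMFILE:", sockets_until_emfile(128))
```

The count comes out a little below the limit because stdin, stdout, stderr
and anything else already open also use descriptors.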


> Playing with the instructions I linked to would also help. It's still
> quite possible the SOCKS code is leaking somewhere...
> John
