Re: BuildStream, Distributed Builds, Bazel Build Farm
- From: Tristan Van Berkom <tristan vanberkom codethink co uk>
- To: Sander Striker <s striker striker nl>, BuildStream <buildstream-list gnome org>
- Subject: Re: BuildStream, Distributed Builds, Bazel Build Farm
- Date: Thu, 18 Jan 2018 16:46:22 +0900
Hi Sander!
This is indeed exciting. I will have to set some time aside to properly
digest these linked papers in advance of FOSDEM.
For a preliminary round, I wanted to share some of my thoughts in
advance: not so much on the technical details, which I have yet to
explore in depth, but rather a taste of the kinds of points I want to
keep in consideration throughout this endeavor - in the hope that this
helps us be productive and start off on the right foot.
On Wed, 2018-01-17 at 13:06 +0000, Sander Striker wrote:
> Hi,
> I've been following both BuildStream and Bazel, and am seeing some
> potential overlaps where we could look into leveraging the collective
> experience and brain power, rather than solve it individually.
> I'm interested in distributed builds in BuildStream. Recent
> developments around Bazel (http://bazel.build/), specifically Build
> Farm (https://github.com/bazelbuild/bazel-buildfarm/) lead me to
> believe there are some common patterns.
So, distributed builds have been a recurring and hot topic, and there
are some mixed opinions floating around: I think Emmet is usually in
favor of leveraging something external to do it, while I have been in
favor of (mostly) rolling our own.
My objective is to find the best middle ground which doesn't result in
a needlessly complex system - where complexity here is measured as the
total count of possible points of failure, not as lines of code in the
BuildStream repo. I think you and I already mostly agree on this.
I also like the way you have raised this; there is a clearly defined
goal which we can design a solution for while carefully weighing the
cost against the benefits of various approaches.
At the risk of getting too technical too soon, I'll try to lay out some
of the initial thoughts I've had so far, so we have a head start; but
before that, I should outline a very simplistic and abstract draft for
context:
BuildStream already has an artifact cache server and a scheduler.
The least complex machinery I can think of is to just allow the
BuildStream CLI to act both as a master and as a slave, where slaves
can be run remotely with permission to access the same artifact
cache. The master need only ensure that the dependencies of a given
element are present in the remote cache, atomically, before
dispatching a job to an available slave (a rough sketch of this
shape follows below).
Slave implementation is as simple as running `bst` to build the
desired element; since dependencies are already cached, no time is
wasted downloading sources that are not needed. Logging needs to
forward messages back to the master in order to aggregate a nice
master session log.
Individual logs for failed builds are trickier, *but* we already
have issue #76, "Cache Failed Builds", on the roadmap; once that is
implemented, this detail becomes simple again anyway.
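To make the shape of this a little more concrete, here is a minimal,
purely hypothetical sketch in Python of what the master side could look
like: it dispatches `bst build` invocations to remote slaves over ssh
and collects their output into a single aggregated log. The host names,
the ssh transport and the dispatch helper are all assumptions for
illustration and not existing BuildStream code; a real master would
also walk the dependency graph and only dispatch an element once its
dependencies are present in the shared artifact cache.

    # Hypothetical illustration only, not existing BuildStream code.
    # A "master" that runs `bst build` on remote "slaves" over ssh,
    # assuming every slave is configured to pull from and push to the
    # same artifact cache server.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    SLAVES = ["builder1.example.com", "builder2.example.com"]  # assumed hosts

    def build_on_slave(slave, element):
        # The slave side is just the ordinary CLI: dependencies come
        # from the shared artifact cache, so nothing is rebuilt or
        # re-fetched locally.
        result = subprocess.run(
            ["ssh", slave, "bst", "build", element],
            capture_output=True, text=True,
        )
        # Returning the captured output lets the master aggregate one
        # session log; per-element failure logs would come later via
        # issue #76 ("Cache Failed Builds").
        return slave, element, result.returncode, result.stdout

    def dispatch(ready_elements):
        # In a real scheduler, only elements whose dependencies are
        # already satisfied in the remote cache would appear in
        # ready_elements.
        with ThreadPoolExecutor(max_workers=len(SLAVES)) as pool:
            jobs = [
                pool.submit(build_on_slave, SLAVES[i % len(SLAVES)], element)
                for i, element in enumerate(ready_elements)
            ]
            for job in jobs:
                slave, element, status, log = job.result()
                print("[%s] %s: exit %d" % (slave, element, status))
                print(log)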
Seeing as scheduling builds on remote machines is staggeringly similar
to what we already do with local processes, you can understand my
reluctance to make the whole process doubly complex by using an
off-the-shelf, turnkey solution (something I think Emmet has been a
proponent of thus far).
That said, other solutions which externalize the problem of distributed
building entirely can also be interesting as a way to set up barriers
against feature creep - we should still keep both avenues in mind.
Now for some sample thoughts to chew on, in advance of a more thorough
conversation:
Remote execution API
~~~~~~~~~~~~~~~~~~~~
Here is something that we clearly lack in the above picture,
so this *seems* to be a clear win, should it prove to be a suitable
external dependency (see further below for an elaboration on
what I mean by suitability).
In general, I am optimistic that the software you propose will
probably check these boxes nicely, and feel that it's desirable
to use something external for this.
Content Addressable Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here we already have an artifact cache and server; by all rational
logic I can think of, I would avoid adding yet another one unless
it either satisfies some real use case, or runs well on multiple
platforms, potentially allowing us to replace the multiple
artifact caches we have with a single one.
The artifact storage facility is an entirely private detail of
BuildStream and can have its implementation swapped without issue.
Of course, similar concerns regarding the suitability of the
dependency need to be examined, as with anything external.
There may be some benefit to adopting this part, but that benefit
should be demonstrated, and in this case it should probably
supersede something that we already have.
This could be any of the following, but ideally all three:
o Local ostree cache for Linux.
o Local tarball cache for the non-Linux fallback platform.
o Remote artifact cache server. Needless to say, if this is
a remote service, it should still have a compatible license
such that anyone can build it and install it on their own
hardware.
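As a point of reference for what "content addressable" buys us here,
below is a toy sketch only; it has nothing to do with the actual ostree
or artifact server implementations, and the class and paths are made up
for illustration. The point is simply that the storage key is derived
from the content itself, so identical artifacts deduplicate naturally
and objects can be committed atomically.

    # Toy illustration of content addressable storage; not the real
    # BuildStream cache code. The key is a hash of the content, so the
    # same artifact is stored exactly once.
    import hashlib
    import os

    class ContentAddressableStore:
        def __init__(self, root):
            self.root = root
            os.makedirs(root, exist_ok=True)

        def put(self, data: bytes) -> str:
            digest = hashlib.sha256(data).hexdigest()
            path = os.path.join(self.root, digest)
            if not os.path.exists(path):       # identical content stored once
                tmp = path + ".tmp"
                with open(tmp, "wb") as f:
                    f.write(data)
                os.replace(tmp, path)          # atomic commit into the store
            return digest

        def get(self, digest: str) -> bytes:
            with open(os.path.join(self.root, digest), "rb") as f:
                return f.read()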
Dependency Suitability
~~~~~~~~~~~~~~~~~~~~~~
In the above I referred vaguely to dependency suitability; to clarify,
some aspects I would consider when weighing various degrees of
suitability include:
o LGPLv2 Compatible License.
o Has a reasonably small set of dependencies itself, contributing to
the overall repeatability of your setup on a modern distro in 10
years' time.
o Has a very stable API, being a responsible citizen towards
its downstream consumers.
o Is an isolated piece of software which does one thing well.
o Requires no additional setup by the user (BuildStream can
configure it via its API, but the user need only install and
configure one thing).
Cheers,
-Tristan