Automated benchmarks of BuildStream



We are keen to make BuildStream as fast as possible. An important
precursor for doing that is to have a way of measuring how well it
performs for the important use cases.

This has been discussed since November last year[1] and we now have
some initial benchmarks that run nightly. You can view the benchmark
pipelines here:

    https://gitlab.com/BuildStream/benchmarks/pipelines

To view the results of the latest benchmark run, click the menu
button on the far right next to a job (the one with a cloud and
downwards arrow icon) and select 'Download benchmarks artifacts'.
This gets you a .zip file containing a `results.json` file, which
holds the "raw" benchmarking results.

Dominic Brown is currently working on post-processing the results so
that you get a nicely rendered table instead of an unordered JSON dump
of the data. For now, you just have to read the JSON but it's pretty
simple:

 * the 'versions' section lists the different versions of BuildStream
   that were tested against each other.

 * each block in the 'tests' section has "name" and "results" fields;
   the "results" section has a block per version that was tested, and
   within each of those you can see the actual measurements from each
   test run.
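To make the layout above concrete, here is a small sketch that walks a
results.json of that shape. The 'versions' and 'tests'/'name'/'results'
fields come from the description above; the exact shape of the sample
data (the "version" and "measurements" keys) is an assumption for
illustration, not the real schema.

```python
import json

# Hypothetical results.json content -- the schema beyond the 'versions'
# and 'tests' fields described above is an assumption for this sketch.
raw = """
{
  "versions": [
    {"name": "1.0.0"},
    {"name": "master"}
  ],
  "tests": [
    {
      "name": "Startup time",
      "results": [
        {"version": "1.0.0", "measurements": [0.52, 0.50]},
        {"version": "master", "measurements": [0.31, 0.30]}
      ]
    }
  ]
}
"""

data = json.loads(raw)

# The versions of BuildStream that were tested against each other.
versions = [v["name"] for v in data["versions"]]

# Report the mean measurement per test and version.
for test in data["tests"]:
    for result in test["results"]:
        mean = sum(result["measurements"]) / len(result["measurements"])
        print(f"{test['name']} [{result['version']}]: {mean:.2f}s")
```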

The tests run on a dedicated laptop here in the Codethink office
(pictured here[3]). More runners would be welcome; they need to be
dedicated bare metal machines of some sort so that we get meaningful
numbers.

Currently we run only two tests. One is called "Startup time" and just
runs `bst --help`; a nice touch is that you can already see in its
results the improvement we got from fixing issue #172. The other test
is called "Build of Baserock stage1-binutils for x86_64" and runs a
very simple build pipeline. (This test uses the Baserock definitions
project[2] as a testcase because that project is still buildable with
BuildStream 1.0.0.)
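The essence of the "Startup time" test is just timing a command over a
few runs and taking the best. Here is a minimal sketch of that idea --
not the benchmark runner's actual code. In the real test the command
would be `bst --help`; this sketch times the Python interpreter instead
so it runs anywhere.

```python
import subprocess
import sys
import time

def time_command(argv, runs=3):
    """Run argv several times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, capture_output=True)
        durations.append(time.perf_counter() - start)
    return durations

# In the real benchmark this would be ["bst", "--help"]; timing the
# interpreter itself keeps this example runnable without BuildStream.
samples = time_command([sys.executable, "--version"])
print(f"fastest of {len(samples)} runs: {min(samples):.3f}s")
```

Taking the minimum of several runs is a common trick for startup-time
measurements, since it filters out noise from a cold cache or a busy
machine.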

The test runner is a simple Python program that lives here:

    https://gitlab.com/BuildStream/benchmarks/

It's intended to be usable from the commandline as well as from CI;
see the README.md for instructions on how to run it.

The tests are defined in that repo too, in this file:

    https://gitlab.com/BuildStream/benchmarks/blob/master/bst_benchmarks/default.benchmark

The format is also partially documented in the benchmarks.git README.
Currently the tests are pretty simplistic. We aim to make the results
much more granular, as sketched out in this spreadsheet[4]. To do this
we need to start parsing BuildStream's log output and Jim MacArthur will
be looking into that.

Another step is to have a project generation script to which we can say
"give me a project with 100,000 elements" and such things, so that we
can, for example, test real scalability limits. Work on this is in
progress at Bloomberg.
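As a rough idea of what such a generator might look like, here is a
sketch that writes N stub elements forming a simple dependency chain.
The project.conf-plus-elements/*.bst layout follows BuildStream 1.x
conventions, but the stub element contents here are an assumption for
illustration, not the actual generator being worked on.

```python
import os
import tempfile

def generate_project(root, num_elements):
    """Write a minimal BuildStream-style project with num_elements stub
    elements, each depending on the previous one (a linear chain).
    """
    os.makedirs(os.path.join(root, "elements"), exist_ok=True)

    # A minimal project.conf; a real project would need more keys.
    with open(os.path.join(root, "project.conf"), "w") as f:
        f.write("name: generated-benchmark\n")

    # One stub .bst element per node in the chain.
    for i in range(num_elements):
        lines = ["kind: manual\n"]
        if i > 0:
            lines += ["depends:\n", f"- element-{i - 1}.bst\n"]
        path = os.path.join(root, "elements", f"element-{i}.bst")
        with open(path, "w") as f:
            f.writelines(lines)

root = tempfile.mkdtemp()
generate_project(root, 100)
print(f"wrote 100 elements under {root}")
```

A generator along these lines could also vary the dependency shape
(wide fan-outs versus long chains) to exercise different parts of the
scheduler.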

All that work is tracked under
<https://gitlab.com/BuildStream/buildstream/issues/205> so subscribe
there for updates. Future work should be tracked as issues against
the 'benchmarks' project.

So that is the current state of our performance improvements. Once the
benchmarking tool is up to scratch, we can get back to the performance
roadmap[5] and do some optimizing :-)

Sam

[1].
https://mail.gnome.org/archives/buildstream-list/2017-November/msg00001.html

[2]. https://gitlab.com/baserock/definitions/

[3].
https://wiki.gnome.org/Projects/BuildStream/Infrastructure?action=AttachFile&do=view&target=benchmark-laptop.jpg

[4].
https://docs.google.com/spreadsheets/d/1fDNXX1roADg-GbhpRbvWSla0Y2J_jv9GLU3oz38ATbw/edit#gid=574863445

[5]. https://wiki.gnome.org/Projects/BuildStream/Roadmaps/Performance2018


--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575

