[BuildStream] Profiling before the gathering - PLEASE PARTICIPATE

From: Daniel Silverstone <daniel silverstone codethink co uk>
To: buildstream-list gnome org
Subject: [BuildStream] Profiling before the gathering - PLEASE PARTICIPATE
Date: Fri, 18 Jan 2019 18:03:50 +0000


Hi everyone.

I intend to discuss profiling and optimisation at the gathering.  In order
to facilitate a data-driven discussion, I have prepared a little profiling
protocol which I would like to ask as many people as possible to run in
whatever contexts they tend to run BuildStream.

You can check out the code from:

    https://gitlab.com/danielsilverstone-ct/performance-testing

The intention is to characterise the system on which you are running BuildStream,
both from a technical perspective via some scripting, and also from a subjective
perspective by asking you to answer some questions about your system.

Then I will ask you to run a series of commands to gather profiling information
about BuildStream itself.  I then request that you package up (e.g. as a
tarball or a zip file) a text document with your answers, the JSON document
that the characterisation profiler script generates, and the series of binary
profile outputs from running BuildStream.  Please then send that package to
me directly (rather than to this list).  I will aggregate the results, perform
some analysis, and then present the methodology and analysis at the gathering,
followed by a profile-guided discussion of the optimisations we might
perform.

Running the profiling protocol
==============================

From the top level of a clone of the above repository, please run ./test.sh


This will use the `hyperfine` tool (included in the repository for Linux and
for Darwin) to run a series of basic tests.  These tests depend on the
availability of `find`, `xargs`, `sha256sum`, `python3`, and the Python
modules required to run BuildStream.  If any of the characterisation tests
fail, please ensure all of those are available.
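
As a rough sketch, the availability of the command-line tools listed above
could be checked before running ./test.sh.  The function name `check_tools`
is hypothetical and not part of the repository, and the Python modules that
BuildStream needs are not checked here.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named tools are missing from PATH.
# Returns 0 when every tool is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The tools named in the text above.
if check_tools find xargs sha256sum python3; then
    echo "all required tools found"
fi
```

Any tool reported as missing would need to be installed before the
characterisation tests can be expected to pass.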

Please run the characterisation tests with your system as un-loaded as
possible.  If you normally work with many things open (e.g. streaming video,
music, etc) then run a *second* characterisation run in your normal work
scenario too.  If you are running virtualisation layers, please run the
characterisation at whatever layer you would normally invoke the `bst` tool.

Once the characterisation tests complete, a JSON document will be written into
the results directory which is created alongside the testing tools.  This
directory is a good place to stage your answers to the following questions:

1. What is the platform you are running on?  Give as much detail as you feel
   comfortable sharing.  For example, basic hardware details such as RAM, CPU,
   storage medium (less about size, more about NVMe vs. SSD vs. hybrid vs.
   spinning rust), OS version if appropriate, etc.  If you're nesting another
   OS on top (for example via a VM, WSL, or Docker) please include details of
   that too.
2. How do you perceive this system's performance in day-to-day use?
3. How do you perceive this system's performance when running the target code?
4. If you have any non-default configuration for BuildStream please include it
   now.  For example, if you change the default number of builders.
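
One possible way to stage the answers is to write a template file into the
results directory and fill it in by hand.  This is only a sketch: the
function name, the `RESULTS_DIR` default, and the file name `answers.txt`
are all assumptions, since the protocol only asks for a text document.

```shell
# Assumed location of the results directory, relative to the repository root.
RESULTS_DIR="${RESULTS_DIR:-results}"

# Hypothetical helper: create answers.txt with one heading per question.
# Refuses to overwrite an existing file so that answers are never lost.
write_answers_template() {
    mkdir -p "$RESULTS_DIR" || return 1
    if [ -e "$RESULTS_DIR/answers.txt" ]; then
        echo "answers.txt already exists, leaving it alone" >&2
        return 1
    fi
    cat > "$RESULTS_DIR/answers.txt" <<'EOF'
1. Platform (RAM, CPU, storage medium, OS version, any VM/WSL/Docker layer):

2. Perceived performance in day-to-day use:

3. Perceived performance when running the target code:

4. Non-default BuildStream configuration (e.g. number of builders):

EOF
}
```

Calling `write_answers_template` from the top level of the clone would then
leave a file ready to edit.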

Next we will be running BuildStream a number of times, so please be careful
and follow the instructions in sequence.  For the following tests, please keep
your computer running however you normally would when working so that we have
the most indicative possible profile results.

You must have BuildStream installed from this branch:

    https://gitlab.com/BuildStream/buildstream/tree/jennis/add_new_profile_topic

It should be installed such that `bst` runs that version.

To begin the tests, we assume you're starting from a fresh clone of the
repository (as above) and so there will be no YAML cache in the test-set.  If
you have run these tests before, please ensure you clear any YAML cache and
your BuildStream artifact cache before you begin the tests.

Change into the test-set directory - this is a BuildStream project:

    $ cd test-set

For the duration of the tests we will be profiling two aspects of BuildStream
(the loader and the scheduler).  In the first cases we'll get a loader profile
and in the case of running `build` we'll get the scheduler instead.  We do not
intend to address push and pull at this time, and as such we do not need an
artifact cache available.

Set in your environment the BST_PROFILE environment variable as follows:

    $ export BST_PROFILE=load-selection

First we're going to prime the YAML cache, profiling the cost for doing so:

    $ bst show base-files/base-files.bst

Note that this is NOT cheap and can take several minutes, partly because this
is not a trivial pipeline and partly because profiling adds significant CPU
cost to the operations.  (My computer takes around 2 minutes.)

A file named something like:

    profile-20190118T173122-load-selection-base-files-base-files-bst.cprofile

will now have been created in the test-set directory.  Please copy that into
your results folder as '1-no-cache-show.cprofile' and then remove the profiles
from the test-set folder entirely.
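
The copy-and-remove step can be sketched as a small shell helper, which may
also be reused for the later profiles.  The function name `stash_profile` is
hypothetical, and the `RESULTS_DIR` default of `../results` is an assumption
based on the results directory sitting alongside the testing tools.

```shell
# Assumed location of the results directory, relative to test-set.
RESULTS_DIR="${RESULTS_DIR:-../results}"

# Hypothetical helper: copy the single profile matching a glob into the
# results directory under a fixed name, then remove all profiles here.
stash_profile() {
    pattern=$1
    target=$2
    # Expand the glob into this function's positional parameters.
    set -- $pattern
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one profile matching $pattern" >&2
        return 1
    fi
    mkdir -p "$RESULTS_DIR" &&
    cp "$1" "$RESULTS_DIR/$target" &&
    rm -f profile-*.cprofile
}

# Example, run from inside the test-set directory:
#   stash_profile 'profile-*-load-selection-*.cprofile' 1-no-cache-show.cprofile
```

The helper refuses to act when zero or several files match, so a stale
profile from an earlier run cannot be copied by mistake.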

Next we're going to run the same command again, but with a hot YAML cache.  In
theory this means that we should go much faster:

    $ bst show base-files/base-files.bst

This should be quicker; it took around 40 seconds for me.  A similarly named
file will have been created as before.  Please copy that as
'2-with-cache-show.cprofile' and delete the profile files from the test-set
directory.

Next we're going to switch to looking at the scheduler.  First let's change
the profile environment variable:

    $ export BST_PROFILE=scheduler

Let's run a build of that element.  This is against an empty artifact cache
if you've followed the above instructions, so it should have to build everything.
The build operations are imports so it should be pretty cheap to do...

    $ bst build base-files/base-files.bst

This will take quite some time.  It took approximately eight minutes on my laptop.

This time the profile file will look more like:

    profile-20190118T174409-scheduler-Fetch_Build.cprofile

Copy this as '3-empty-cache-build.cprofile' and delete the profile files from the
test-set directory.

Finally we want to know the result of attempting a build when everything is cached:

    $ export BST_PROFILE=load-selection
    $ bst show base-files/base-files.bst

This took my laptop around 45 seconds to run.

Copy the profile for load-selection as '4-fully-cached-load.cprofile' and clear
down the profile files and the .bst directory from the test-set.

This is it for the initial testing we'd like to perform.  Please package up all
the information and email it to me directly.  I'll endeavour to post a summary
to the list once I've prepared everything for the gathering.
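
Packaging might look something like the following sketch.  The function name
`package_results`, the `RESULTS_DIR` default, and the archive name are
assumptions; only the four profile file names come from the steps above.

```shell
# Assumed locations; adjust to match your clone.
RESULTS_DIR="${RESULTS_DIR:-results}"
ARCHIVE="${ARCHIVE:-bst-profiling-$(date +%Y%m%d).tar.gz}"

# Hypothetical helper: warn about missing profiles, then create a gzipped
# tarball of the results directory and list its contents as a sanity check.
package_results() {
    if [ ! -d "$RESULTS_DIR" ]; then
        echo "no results directory at $RESULTS_DIR" >&2
        return 1
    fi
    for name in 1-no-cache-show 2-with-cache-show \
                3-empty-cache-build 4-fully-cached-load; do
        [ -f "$RESULTS_DIR/$name.cprofile" ] ||
            echo "warning: $name.cprofile not found" >&2
    done
    tar -czf "$ARCHIVE" -C "$(dirname "$RESULTS_DIR")" \
        "$(basename "$RESULTS_DIR")" &&
    tar -tzf "$ARCHIVE"
}
```

Running `package_results` from the top level of the clone would produce a
single archive containing the answers, the JSON document, and the profiles.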

Thanks,

Daniel.

-- 
Daniel Silverstone                          https://www.codethink.co.uk/
Solutions Architect               GPG 4096/R Key Id: 3CCE BABE 206C 3B69
