Re: [BuildStream] Proposal: Add support for running tests in BuildStream



Hi again Chandan,

This is a lot to think about, so I took the time to break down
these requirements and brainstorm a bit further...

On Thu, 2018-07-19 at 20:29 -0400, Chandan Singh wrote:
Hi Tristan,

Couple of clarifications before we dig deeper:

Reducing the number of .bst files was one of the motivations but not the
primary goal. The main issue I see with having tests as a separate element is
that if they fail, the build for the main element will still be reported as
successful. This is problematic because then if we want to ensure that the
tests are passing before generating a deployable artifact from that element, we
have to find the corresponding test element. As there is no definitive way of
finding that element, we will have to rely on naming conventions, which is not
ideal.

To mitigate this, what we currently do is to run the tests as part of the build
process. This has the obvious downside of making the build times worse,
especially if the tests are more complicated than simple unit tests. Trying to
make the tests run in parallel to the builds of the reverse dependencies is my
primary goal.


I'm not too attached to my implementation proposal and would be happy with any
other way of implementing it. But in the end I would like to be able to run
tests in a way that:

- the build for the element fails if the tests fail
- ideally, they don't block builds of its reverse-dependencies

If we can achieve the above with some yaml magic, I would be happy to treat
that as a solution to this problem.


Coming back to your last message.

  o That an element now has different dependencies depending on what
    it's doing... does this mean even more cache keys ?

Probably yes, if we go with my current proposal. I suppose it would introduce
a new form of "unsafe" cache keys. I haven't fully figured out the
implementation just yet but here's how I imagine it would work at a high-level:

- once the build finishes, an artifact is stored with an unsafe cache key
- at this point, elements depending on it can use this artifact but their
  resulting artifacts will also be marked unsafe
- when trying to find a dependency in cache, elements should give less priority
  to the artifacts corresponding to the unsafe cache key
- if the tests pass, the proper cache key will also point to the same
  artifact and then it can be pushed
- if the tests fail, the artifacts are scheduled for deletion

Another point that this ignores is that tests are not necessarily
relevant to every kind of element, but are certainly relevant to the
build elements.

Fair point. I agree I was mostly thinking about BuildElements. Special-casing
them would not be ideal if we can avoid it.

Very interested in hearing any suggestions on how to tackle this problem.

Ok so there are a few main points I can glean from this, I'll try to
condense them here:

 a.) We don't want tests to block reverse dependency builds.

 b.) We want to ensure that tests are passing before deploying
     something (I think in your case it's a .deb, for other use cases
     it can be another packaging format, a full system firmware, etc).

 c.) We want to structure things such that we are sure which backing
     element is associated to a failing test, this should be clearer
     and stronger than a naming convention.

 d.) We want to minimize the amount of .bst files and the amount of
     YAML which needs to be maintained.

 e.) Tests often require a different execution environment and set
     of dependencies.

     E.g. many tests found in the modules which make up a typical linux
     system require external services to run, like an X server or a
     database.

In addition to your points, I would add:

 f.) We want to be able to run many kinds of tests, e.g. some tests
     are provided by upstream module maintainers to test a specific
     module, other tests may involve launching a VM and capturing
     screenshots.

 g.) With caching of build trees on the horizon, we probably want some
     semantic to allow reuse of that build tree for testing purposes.

Let's skip over (a); I think all solutions aside from running tests
inside the BuildElement as a workaround would cater to (a).

Focusing on (b) and (c), I would imagine that a natural way to
structure a package deploying project would have a pipeline structured
similar to this (with (B)uild, (T)est, (P)ackage and (S)tack elements):

     B(foo)     (additional test deps)
       |  \       /
       |    \    /
       |    T(foo)
       |    /
       |  /
     P(foo)
       |
       |
  S(main target)

It appears that satisfying (b) is very easy, we just need to make
packaging of "foo" contingent not only on building of "foo", but also
on testing of "foo".

Addressing (c) I think is a matter of having a strong relationship
between the test and what element that test is testing (although at
this point there are many ways of looking at the problem).

To brainstorm this, I would suggest that the potential test Element
plugin have the following properties:

   o Tests a specific element
   o Can add dependencies for the sake of testing; these must be
     build-only dependencies (so they are not propagated forward).
   o Can optionally stage the build tree of the built element
     - needed for `make check` style tests, not necessary for other
       integration style tests
     - satisfy (g) as an optimization (no need to rebuild just to run
       make check)
     - alternatively without (g), needs to stage the sources of the
       element it's testing
   o The output of the test element is exactly the same output
     as the build element itself (similar to a filter element without
     the filtering, acts as a "pass through" but with a check/test)

With a test Element designed like this, the dependency graphs can be
simplified and I think we manage to address (c) by ensuring that
semantically, the test Element "tests a specific element".

     B(foo)  (additional test deps)
       |     /
       |   /
     T(foo)
       |
       |
     P(foo)
       |
       |
  S(main target)

Orthogonal BuildElements which depend on "foo" can depend directly on
B(foo), while deployments of "foo" should depend on T(foo). A failure
of T(foo) is always clearly a failure related to B(foo); there is no
room for doubt.
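
To make that concrete, here is a rough Python model of the graph above.
It is only a sketch for discussion and does not use any BuildStream API:
the DEPS table, the extra B(bar) element and the buildable() and blame()
helpers are all made up for illustration. It shows that a failing T(foo)
stops P(foo) and the main target, but not another build which depends
only on B(foo), and that the failure maps back to exactly one element.

```python
# Hypothetical model of the pipeline; none of these names are BuildStream APIs.
from typing import Dict, List, Set

# Each element lists the elements it depends on. For a test element, the
# first dependency is assumed to be the element under test.
DEPS: Dict[str, List[str]] = {
    "B(foo)": [],
    "test-deps": [],
    "T(foo)": ["B(foo)", "test-deps"],
    "P(foo)": ["T(foo)"],        # packaging is contingent on the test element
    "B(bar)": ["B(foo)"],        # assumed extra build, needs only the build of foo
    "S(main)": ["P(foo)", "B(bar)"],
}


def buildable(failed: Set[str]) -> Set[str]:
    """Return the elements that can still complete, given the failed ones."""
    done: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name, deps in DEPS.items():
            if name in done or name in failed:
                continue
            # An element completes once all of its dependencies completed.
            if all(dep in done for dep in deps):
                done.add(name)
                changed = True
    return done


def blame(failed_test: str) -> str:
    """Name the single element a failed test element was testing."""
    return DEPS[failed_test][0]


result = buildable(failed={"T(foo)"})
print(sorted(result))
print(blame("T(foo)"))
```

The point of the model is only that the gating falls out of ordinary
dependencies; nothing here requires new cache key semantics.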

Unless I'm missing something, this only leaves the problem of (d), which
can at least be thought of as a completely orthogonal problem. Solving
(d) is of high value on its own, I think, as it should let us express
complex pipelines more easily.

As an example of what could be possible with a little YAML-foo, we
should be able to express B(foo), T(foo) and P(foo) in the same foo.bst
file, and it would probably make sense that if a multi-element .bst
file was referred to without a subscript, it should always mean the
last element. We could even have a grouping mode which implies a series
of elements which depend on each other, if that makes sense.

Such a foo.bst file might look like this:

   ===============================================================
   # Specify the grouping type, let's say `sequence` means that
   # the elements depend on each other in a sequence
   sequence

   --
   kind: autotools
   # depend on the successful *build* of bar
   depends:
   - bar.bst[0]

   --
   kind: test
   # Build depend on additional deps
   depends:
   - filename: base/mysql.bst
     type: build

   config:
     # Specify the element to test, probably in such a grouping
     # there is a way to specify this without saying "foo.bst",
     # perhaps self[0] or such could work.
     #
     test-element: foo.bst[0]
     commands:
     - make check

   --
   # In the deploy stage we probably don't need much configuration,
   # this is driven mostly by public data from the depended upon
   # elements, and if the test element is a passthrough, we already
   # implicitly depend on it due to being part of a "sequence".
   #
   kind: deb_deploy
   ===============================================================
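
The lookup rules for such a file could be sketched in Python roughly as
follows. This is purely illustrative: the subscript syntax is only the
idea floated above, and GROUPS, resolve() and implicit_depends() are
hypothetical names, not anything that exists in BuildStream. It assumes
that a reference without a subscript means the last element of the
group, and that in a `sequence` each element depends on the one before.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical: foo.bst declared as a `sequence` of three element kinds.
GROUPS: Dict[str, List[str]] = {
    "foo.bst": ["autotools", "test", "deb_deploy"],
}

# Matches "foo.bst" or "foo.bst[N]".
REF = re.compile(r"^(?P<file>[^\[\]]+)(?:\[(?P<index>\d+)\])?$")


def resolve(ref: str) -> Tuple[str, int]:
    """Resolve an element reference to (filename, index in the group)."""
    match = REF.match(ref)
    if match is None:
        raise ValueError(f"malformed element reference: {ref!r}")
    filename = match.group("file")
    count = len(GROUPS[filename])
    subscript = match.group("index")
    # No subscript always means the last element of the group.
    index = count - 1 if subscript is None else int(subscript)
    if index >= count:
        raise IndexError(f"{filename} only has {count} elements")
    return filename, index


def implicit_depends(filename: str, index: int) -> List[Tuple[str, int]]:
    """In a `sequence`, each element depends on the one before it."""
    return [(filename, index - 1)] if index > 0 else []


print(resolve("foo.bst"))      # the deploy element, last in the group
print(resolve("foo.bst[0]"))   # the build element
print(implicit_depends(*resolve("foo.bst")))
```

With rules like these, `bar.bst[0]` in the example names only the build
of bar, while a plain `foo.bst` names whatever comes last in the group.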

If it is the case that a lot of the configuration in these files becomes
redundant (i.e. only the "name" of something changes, with lots of
boilerplate), then it would be interesting to employ something like
macros, turning large parts of boilerplate into one-liners.

In use cases where you do not deploy a "package", a later "compose" and
firmware creating element would still depend on the last element of the
groups.

In the unlikely but possible case where you want to `bst checkout` the
individual build results of each BuildElement as a part of your
production pipeline, then doing `bst checkout foo.bst` would still be a
checkout of the BuildElement content that is contingent on the tests
having passed.


Of course this is not entirely specced out, but do you see anything
with the general direction which fails to satisfy your use cases, or
the wider general use cases we should be considering for the tool ?

Best Regards,
    -Tristan


