Re: [BuildStream] Proposal: Decouple source tracking and building



Hi Tristan,

On Mon, 2019-10-21 at 16:07 +0900, Tristan Van Berkom wrote:
> Hi Jürg,
>
> On Oct 21, 2019, at 3:21 PM, Jürg Billeter <j bitron ch> wrote:

[...]

> > Would it be possible to use git post-receive hooks or triggers to
> > update the 'refs' in the BuildStream project instead of periodically
> > tracking all git repos? That would likely be faster overall.
> >
> > This wouldn't work for external git repositories, of course. However,
> > in that case you would probably want git mirroring to local servers
> > anyway, and that could again trigger the 'ref' updates.
>
> You could write some scripts which update the YAML, yes; you might
> also do that for tarballs and for other source types as needed,
> replicating much of the value that BuildStream already supplies with
> its tracking feature. But do we *want* that to be the user story for
> the build solution we’re creating and want everyone to use?

The git hook could still execute `bst source track foo.bst`. My point
was that if you have many git repositories that you want to track
frequently, triggering tracking for one repo at a time from git hooks
might be faster overall when server horsepower is limited.
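
To illustrate the hook idea, here is a minimal sketch of a git
post-receive hook, not a tested implementation. The project path, the
element name and the tracked branch are hypothetical placeholders, and
the sketch assumes the hook runs on a machine that has `bst` installed
and a checkout of the BuildStream project. Only `bst source track` comes
from the discussion above; everything else is an assumption.

```python
#!/usr/bin/env python3
"""Sketch of a git post-receive hook that re-tracks one BuildStream element."""
import subprocess
import sys

# Hypothetical values: adjust to your own setup.
PROJECT_DIR = "/srv/buildstream/project"  # checkout of the BuildStream project
ELEMENT = "foo.bst"                       # element whose source is this repo
TRACKED_REF = "refs/heads/master"         # branch the element tracks


def updated_refs(lines):
    """Yield ref names from post-receive input lines ('<old> <new> <refname>')."""
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            yield parts[2]


def track_command(element):
    """Build the command line that re-tracks a single element."""
    return ["bst", "source", "track", element]


def main(stdin):
    """Run tracking only if the push touched the tracked branch."""
    if TRACKED_REF in updated_refs(stdin):
        subprocess.run(track_command(ELEMENT), cwd=PROJECT_DIR, check=True)


# In the installed hook, finish with: main(sys.stdin)
```

Committing or otherwise publishing the updated refs is left out here; how
that is done would depend on the project's workflow.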

You've mentioned that you want to report build failures as fast as
possible, and I expect being able to skip the tracking part of `bst
build` (as tracking would already be covered by the git hooks) would
result in even faster turnarounds than the `bst build --track` approach
as the latter has to talk to every git repository, even those without
changes.

I.e., in this very specific CI situation that you described, the
optimal solution might not require `bst build --track`. And in many
other use cases, tracking and building in two separate pipelines does
not result in a bottleneck. That said, it's definitely possible that
there are use cases where the combined track+build pipeline is crucial.

> The tracking feature is very convenient and valuable already and we
> should leverage it.
>
> Honestly, I do think this is a valuable enough feature to justify the
> complexity it adds.

I'm not sure whether you're referring to the general source tracking
feature or specifically `bst build --track` with the combined pipeline.
I definitely agree that the general tracking feature is very convenient
and nobody is proposing to remove that, as far as I know. The main
question to me is whether the performance benefits of the combined
track+build pipeline are worth it.

We should keep in mind that it's not just about maintenance headache.
State updates also have a performance cost, at least the way they are
implemented right now. I.e., if you're building a project with a large
number of elements, it might even be possible that separate track and
build would be faster than the current `bst build --track` after
simplifying and optimizing state handling in the code.

> I suspect that the main motivator to remove this is that it makes
> state handling complex. State handling has been a bit of a mess for a
> long time and update_state() is maddening for most readers.
>
> I think we’ve learned that in the current architecture, when we teach
> BuildStream to do really cool things (like the feature under
> discussion), we risk making update_state() worse.
>
> I would much prefer that we have the expectation that complexity is
> only going to increase and that we will always make BuildStream do
> really cool things, and we should refactor with that expectation so
> that complexity doesn’t bottleneck in functions like update_state(),
> but instead concerns are neatly separated. Maybe this is unrealistic?

While it should certainly not be a surprise that the complexity of a
software project typically increases over time, we shouldn't dismiss
proposals for simplifications just because the complexity increase is
normal. If we've added a feature in the past that turns out to be less
useful than expected (and/or conflicts with future features), I think
it's very healthy to reevaluate whether the feature is worth it (we
also always have to consider stability promises, of course).

I don't know whether `bst build --track` is or isn't worth it. However,
I think it's a valid discussion.

And yes, we're working towards the removal of `_update_state()`. This
should help reduce the complexity involved in the combined track+build
pipeline. However, the track+build pipeline support will not suddenly
become completely trivial.

Cheers,
Jürg


