Re: [BuildStream] Buildtrees in 1.4 release

From: Sander Striker <s striker striker nl>
To: Tristan Van Berkom <tristan vanberkom codethink co uk>
Cc: Tom Pollard <tom pollard codethink co uk>, buildstream-list gnome org
Subject: Re: [BuildStream] Buildtrees in 1.4 release
Date: Mon, 10 Dec 2018 13:13:28 +0100

Hi,

On Sun, Dec 9, 2018 at 9:37 AM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:

Hi Sander,

On Fri, 2018-12-07 at 13:31 +0100, Sander Striker wrote:
> On Fri, Dec 7, 2018 at 6:56 AM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:
[...]
> > Hopefully backing out of our rash decision of advancing the release
> > cycle in mid-cycle will relax the situation.
> >
> > I should add, this is not only for the sake of build trees, we planned
> > to make a lot of breaking CLI changes which ideally we will never,
> > ever, ever do again;
>
> Theoretically, yes. When I hear a desire to never release 2.0, I
> have very strong doubts.

I am starting to have my doubts about this too; given how far the
original tweaks to the workspace CLI have spun out of control and turned
into an open season on CLI breakage, it may make sense to jump to 2.0
instead.

> What is considered to remain stable vs not is selective, as per the
> cache key stability thread. In other words just looking at the
> version number doesn't inform the user fully in terms of
> expectations.

It is not selective. Cache key stability was always known to not have
any guarantees, and with all of the changes going into the artifact
format, providing such a guarantee too early would have caused a lot of
friction (we would have to continue to support operation against every
different artifact format version over time).

For this reason it was very intentional that we did not advertise
artifact cache key stability in our 1.0 announcement:

https://mail.gnome.org/archives/buildstream-list/2018-January/msg00006.html

Sure, but consider the unsuspecting user who only looks at semantic version numbers. For a 1.x -> 1.x upgrade I expect everything to work as before, perhaps with some bug fixes and some additional features, but with no surprises when I upgrade. That includes expecting `bst build` to decide on exactly the same things to build.

As such it is selective: we chose to make an exception to the stability rules, and with that we taint (weaken) the meaning of a 1.x series.
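
To make that expectation concrete, here is a minimal sketch of what a user who reads only the version number would assume. The helper name and the simple dotted-integer parsing are hypothetical illustrations; nothing like this exists in BuildStream.

```python
def expects_no_surprises(installed: str, candidate: str) -> bool:
    """Return True when a semantic-versioning reader would expect a safe upgrade.

    Hypothetical helper: versions are plain dotted integers such as "1.2.0".
    An upgrade is assumed safe when the major number is unchanged and the
    candidate is not older than the installed version.
    """
    old = tuple(int(part) for part in installed.split("."))
    new = tuple(int(part) for part in candidate.split("."))
    return new[0] == old[0] and new >= old
```

Under this reading an upgrade within the 1.x series is expected to be safe, so a change in which elements get rebuilt would count as a surprise.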

> > and we should land these mostly simple changes
> > once and for all in 1.4; what is killing us in landing these features
> > is not the functionality itself but the tiresome bike shedding which
> > surrounds them, unfortunately.
>
> Can you be specific about which features you consider to be subject
> to bike shedding? And if you recognize this, I am assuming you don't
> care about the color of the shed?

In this instance bike shedding might be an overstatement, a lot of the
functionality which is landing *needs* a lot of consideration about the
color and shape of the bike shed.

The UI/UX is important; it is what the user sees and interacts with. I wouldn't call it a bike shed. If we did, then we could make that argument about a lot of things, ranging from coding style to testing strategies to architecture. It's all a bike shed, right?

Obviously you care too much to think this, so let's chalk this up to our communication styles?

Everything that surrounds `bst artifact` and the potential `bst source`
subgroups, everything about how command line behaviors will be affected
when commands expose new optionalities about build trees, involves very
little in terms of implementation, and a whole lot of consideration
about how the functionality will be presented to the user.

This necessarily slows us down.

+1.

[...]
> > The creation of build trees is so deeply entrenched into the code that
> > I really don't think backing it out is an option, or rather, it is the
> > most expensive option.
>
> I'm not convinced of that at all; it would surprise me if it took
> more than a day to make build trees disappear.

If we yanked build trees, we don't have the ability to debug remotely
failed builds.

This *also* means that we don't have the ability to debug *locally*
failed builds, because the old strategy of littering the user's hard
disk with failed build directories has been elegantly removed in favor
of sharing the same code paths which reproduce the build environment
from an artifact.

Reverting this part might not be as difficult as it seems, but it is
also a bit entangled in how we handle workspaces and mounting of them,
so definitely some care is needed in reverting that sensitive area if
we were to yank them.

Sure, I buy that.

> > As everyone knows I am spiteful of any talk about "experimental
> > flags",
> > let's use it at least as a strawman here: If we were even to consider an
> > experimental flag where everything was "optionally turned on", this is
> > even more technically challenging in my view than adding the missing
> > optionalities that are in mid flight (not only are downloads and
> > uploads, and possibly even creations optional, but the whole thing is
> > optional again - as well as dependent features which should "disappear"
> > when the experimental flag is turned off).
>
> It is exactly the strict guarantees that you wish to give with
> versioning that result in the suggestion of an experimental flag. I
> don't want to be having the conversation with you for the 1.6 cycle
> that making build trees optional is going to be a change in behavior,
> and therefore cannot be done.

We obviously need to just take the time to consider what to ultimately
do with build trees and with testing, and then we don't have to worry
about the burdens of experimental flags.

The strict guarantees of reliable and unchanging interfaces need to be
there, we have to remember that we are not working on software from
scratch: we have been feature complete for a core set of use cases
which users are relying on, and we have been for some time.

I believe it is a critical time for the project to focus on
demonstrating reliability to the set of users we have managed to amass,
our user base will grow slowly based on our reliability at this point,
but vanish quickly if we cannot demonstrate reliability.

[...]
> > As such, I think that considering this early commitment as a blocker of
> > the feature, on the sole grounds that a new experimental feature might
> > perform better, is the wrong move.
>
> I'm sorry Tristan, but it goes a bit further than this. Making build
> trees mandatory does make local builds more expensive as well. If
> someone would run the benchmarks I'm pretty sure that we can see that
> (unless IO suddenly became free).

I completely agree with this, and it is *exactly* the point I am trying
to make.

We should not commit to this *only* based on the fact that it might
currently make a new experimental feature suffer, we should base the
severity of having to make this call on local builds as well.

If local builds do not suffer, maybe we can afford to postpone this
commitment, maybe we can solve it in the remote execution case in some
other way.

If we consider testing, we should also consider remote execution in the same vein, because if the parallelism for testing kills the performance of remote execution, there is not much point.

> > That is not to say that the optionality of creating/caching build trees
> > is not otherwise beneficial, just that we can postpone committing to
> > this optionality if the only reason to commit to it is for a feature
> > that is still in the process of materializing.
>
> I wish we cared a bit more about performance. And that in the case
> where users do not want to incur the cost of
> creating/storing/pushing/pulling, we shouldn't make them pay.
>
> > > > > In other words, the local cache only needs to be large enough
> > > > > to store the sources, build trees and build results for a single
> > > > > full build - this is the same amount of local disk space required
> > > > > for a full build using say, JHBuild.
> > >
> > >
> > > > > Has anyone even raised an issue or proposed that *creation* of
> > > > > the build trees should be optional ?
> > >
> > > My choice of words may not have been perfect, but I would
> > > consider https://gitlab.com/BuildStream/buildstream/issues/566 to be
> > > it. In that we also explicitly state that "Yes we are certainly on
> > > the same page here - I do want to see this used a lot before
> > > introducing an escape route, but agree that we should design one at
> > > some point before freezing things in 1.4." That was three months
> > > ago.
> >
> > Yes, this was always discussed in terms of making the uploads optional,
> > not the creations. If there is something in the comments which speaks
> > of making creations optional as well as uploads, then I'm sorry that it
> > escaped me.
>
> It is literally the first words in the issue: "The build tree should
> be optional in the artifact." And seriously, having to spell out
> "don't create a build tree in an artifact if you are not going to use
> it" seems excessive to get the point across. I think you can safely
> replace "upload" with "create" in #566's summary.

There is a very big difference, if we do not cache the build tree
locally then we do not have any way to debug a build which fails
locally.

When I filed that issue, I don't think we had even landed the first MR on build trees. As the feature developed we should have revisited it, but alas.

In any case, I think you mistook my tone as an aggressive one; when I
say that I am sorry that this distinction escaped me, I do mean it with
the utmost sincerity.

I probably did, as you probably did mine. I always try to read every post in the most positive light and with the best of intentions. Unfortunately, I am only human and sometimes fall into the trap of reading things in a negative tone.

[...]
> > We have time to complete this optionality, and this optionality is less
> > expensive than an additional blanket optionality (experimental flag
> > strawman above) or backing it out entirely, I really think pushing
> > forward is the smallest effort of any of the choices.
>
> Repeat three of this statement.

I think that we agree that the best scenario is to figure out how to
make this optional in the right way and commit to it.

+1.

To be frank, it is hard for me to glean from this email what you desire
as an outcome.

Let's just step back and figure out what to do with build trees, to ensure the feature doesn't have a negative impact.

My desire is to move forward without backing anything out, having taken
the time to consider what this means for our future plans first, I
don't want to back anything out and I would *like* very much to have
this optional in the next release - only I want to know exactly how we
are going to tackle this for the testing plan first.

Now that we have bought time by letting go of the release dates, I think we can figure this out. My main objection was releasing the feature with the behavior as-is.

> > That said, all of this talk is secondary to the elephant in the room;
> > which is that very soon the build trees are *going to be mandatory*
> > dependencies for elements which run `make check` on the build trees they
> > necessarily depend on:
> >
> > https://mail.gnome.org/archives/buildstream-list/2018-November/msg00042.html
>
> I still have my doubts that this is the right approach for testing; I
> may have only mentioned this in a private setting currently, mostly
> for opportunistic/time constraint reasons. Creating dependencies on
> build trees from another element I particularly frown upon. It's a
> dependency type we don't even support yet, and I question whether we
> should support it. However this goes into the subject of testing and
> veers away from what we are talking about here.

I think this is the very crux of the issue.

It seems to me extremely obvious, that if we have any chance to
parallelize the builds of reverse dependencies with the `make check`
phase, then we *must* have a way to depend on the build tree of an
artifact from the element which runs `make check` on that build tree.

We should take this back to the other thread. In short, if the element that runs `make check` is the same element, then there is no other element to depend on. This is a different approach to testing as a whole, which is why I mentioned that I still have my doubts about the current approach.

This is *exactly* why making a commitment to this optionality is
dangerous and might be better postponed.

And here we disagree. Making things mandatory for one feature should not be detrimental for situations where that feature doesn't even get used.

Before considering a blanket "build trees are optional" statement
entirely, we should consider *how* to make them mandatory for a subset
of cases.

Would it make sense to make the build tree mandatory in the upload of a
failed build artifact ?

If so, it may make sense to do the same for a build tree that is needed
by another element later in the pipeline.

Maybe this can be limited to the same project with a statement such as
"you cannot depend on a build tree of an element across a junction",
making the situation more manageable ?

Or, maybe when depending on a build tree of another element, we make a
statement that the build tree needs to be recreated on demand if it is
not downloadable ?
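
The first three questions above can be sketched as a single decision function. This is only an illustration of the options being floated, assuming each of them were adopted; the names `ArtifactContext` and `build_tree_required` are hypothetical and do not exist in BuildStream.

```python
from dataclasses import dataclass


@dataclass
class ArtifactContext:
    """Hypothetical summary of what is known about an element's artifact."""
    build_failed: bool               # the build failed, so the tree is needed for debugging
    needed_by_dependent: bool        # another element depends on this build tree
    dependent_across_junction: bool  # that dependent lives in a different project


def build_tree_required(ctx: ArtifactContext) -> bool:
    """Decide whether the build tree must be kept in the artifact."""
    if ctx.build_failed:
        # Assumed rule: failed builds always keep their build tree.
        return True
    if ctx.needed_by_dependent:
        if ctx.dependent_across_junction:
            # Assumed rule: build tree dependencies may not cross a junction.
            raise ValueError("cannot depend on a build tree across a junction")
        return True
    # Otherwise the build tree stays optional.
    return False
```

The last option, recreating a build tree on demand when it cannot be downloaded, is not modeled here because it changes what happens after the decision rather than the decision itself.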

Let's bring these to a new thread?

> > While postponing optionality is possible if the negative impact is not
> > too bad in local execution (no need to commit while experimental
> > features are still in flux), the conversation of how this commitment
> > impacts the future plans for the testing elements I think is
> > inescapable.
>
> Well... unlike other features testing currently is a discussion, not
> an implemented feature. And nothing experimental here, just
> something non-existent. So while I agree we should not paint
> ourselves into a corner knowingly, I find it awkward to see this
> raised as a blocker.

I am buying us time to avoid painting ourselves into a corner, I don't
see what is awkward about this.

My point was more that testing is still being designed, while remote execution is already concrete.

Having remote execution as "experimental" is essentially the same thing
as not having it at all when it comes to project commitments in the
stable set of APIs and features - if something exists but is
experimental, then the user has no guarantee that it will be unchanging
or reliable.

Well... we know why we consider it experimental, which has to do with API stability. That doesn't mean we should knowingly make it perform poorly and render it useless.

The end goal here is to come to a point where we are satisfied with the
featuresets we want to introduce by the time we commit to them.

The goal is not to make people angry about an optionality which did or
did not land by a given date.

[...]
> > Unfortunately, saying later that "turning off build trees" only means
> > "turning off the build trees which are not required" is very unwieldy
> > in practice; making the distinction of "a required build tree" is
> > almost impossible, it is like having the past be informed by the
> > future.
>
> I really have no idea what you are trying to say here. In the
> hypothetical case that I want to create an element with a dependency
> on another element's build tree, I will need to turn on the
> creation of build trees for that element. No time warps necessary.

You speak of that as if it were something obvious.

It is [intuitive] from my perspective.

However no thought has ever been put into the idea of giving an element
the ability to force its build tree to be mandatory explicitly, or
whether that is the best approach to ensure the existence of mandatory
build trees.

Maybe a build tree is only mandatory on the host which is processing
the tests in the reverse dependency test element, and maybe the
elements can be grouped together in such a way that the given build
tree is not necessarily ever uploaded to an artifact server ?

This is exactly the kind of conversation I would like to encourage us
to have, a little bit of thought and design towards what we intend to
achieve.

+1. Let's bring that to a new thread.

[...]
> > If we can answer the question of how to reconcile this plan of build
> > tree optionality, with the plan of explicitly depending on build trees
> > in test elements, then I'm sure we can solve this in time.
>
> I consider this a blocker for any testing proposal. Build trees
> should never be mandatory, a design assuming this is just
> fundamentally flawed.

And I flatly disagree with this.

Sounds like we need to have the discussion.

I don't think this needs to be a point of contention though, the
question is only how to have *only* the build trees which matter to the
reverse dependencies which need them, and to avoid overloading the
whole process just because one element temporarily needs the build tree
provided by another.

Let's have the conversation about whether we need the build tree of another element at all?

Cheers,

Sander

Cheers,
-Tristan

Cheers,

Sander
