Re: [BuildStream] Coping with partial artifacts



Hi Sander,

Let's try this again. I was very on edge two weeks ago for reasons
unrelated to the project; apologies for being a bit snappy at the time.

On Fri, 2018-11-02 at 11:57 +0000, Sander Striker wrote:
Hi,

On Fri, Nov 2, 2018 at 8:40 AM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:
On Thu, 2018-11-01 at 22:25 +0000, Sander Striker via BuildStream-list
wrote:
Hi,
 
I expect CAS and ArtifactCache (and ActionCache) to contain partial
artifacts at some point due to different retention policies for
different parts.  For example, build trees might expire much faster
than the core artifact (the part that is staged when depended upon). 
Similarly logs may expire as well.
 
This is all policy based, and very much use case specific.  I do not
wish for us to design or implement how partial expiry of artifacts
might work, I merely want us to acknowledge that artifacts may become
partial in the future.  And with that acknowledgement consider future
proofing buildstream to gracefully deal with partial artifacts.

Since we have agreed to implement configurability of uploads of the
build trees[0], yes.

However for the rest, log files, etc; I think not - this is not a
generic thing and I consider it very unwise to go down a path where
arbitrary parts of an artifact might be missing.

It wouldn't be arbitrary.  We would need to agree on what these pieces
are.  The policy on what to retain and what not will be different in different
contexts.  I don't see a need for BuildStream to provide this functionality in
its bundled cache, but external cache providers should be allowed to do this
to properly manage available resources.

BuildStream needs to know that an artifact exists or not, and rely on
this to make the assumption that all of the components of the artifact
exist.

Why?  I hope you're not saying that I need to have the logs of an artifact I
depend on to be able to build?  At build time, I would argue that we don't
want to pull down _any_ information over the network that we don't actually
need.  Nor want to store in our local caches (which people are already
aggressively wanting to clean).
And this being the case, why couldn't BuildStream then fail friendly when
I run `bst artifact log mine.bst` ?
 
Implementing configurability of the build trees on its own is going to
already impose considerable complexity, we should not allow this to
grow any further.

I am probably missing something, can you elaborate on the
considerable complexity?  Because I see the complexity coming in
regardless for different reasons (as outlined above).

I will clarify that I feel quite defensive against partial artifacts,
including (or even especially) the build trees not being able to be
assumed with the presence of an artifact.

We are headed down a slippery slope complexity-wise, and it is of very
high importance that we don't end up playing a game of Jenga inside
`Element._update_state()`, which is already a focal point of confusion
and desperately needs fixing in my opinion. Partial artifacts make
this worse.

Aside from this, we risk littering the codebase with complex
state machines and branch statements when deciding how various
features should behave when parts of an artifact are missing
(this is more true of the build trees than it would be
of your example, the logs).

In this instance I will reinterpret your email as a recognition of this
danger on your part, instead of a simple insistence that we should have
more uncertainty about which parts of an artifact are present in a
local artifact cache.

In other words, to avoid the scattered complexity which results
from this, I think we need to work out carefully how to introduce
an abstraction layer into the codebase to handle these checks and
failure modes, so that knowledge of them is not spread thinly
across the codebase.
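
To make the idea concrete, here is a rough sketch of what I mean by
one place owning the "is this part present?" check. Nothing here is
existing BuildStream API: `Component`, `ArtifactComponentMissing` and
`ComponentStore` are hypothetical names, and treating the staged files
as the part which defines "the artifact exists" is an assumption made
only for the illustration.

```python
from enum import Enum


class Component(Enum):
    # Hypothetical split of an artifact into separately expirable parts
    FILES = "files"
    BUILDTREE = "buildtree"
    LOGS = "logs"


class ArtifactComponentMissing(Exception):
    """Single, well defined failure for a missing part of an artifact."""

    def __init__(self, element_name, component):
        super().__init__(
            "{}: artifact is cached without its '{}' component".format(
                element_name, component.value
            )
        )
        self.element_name = element_name
        self.component = component


class ComponentStore:
    """Hypothetical stand-in for the cache layer, backed by a dict."""

    def __init__(self):
        self._store = {}

    def add(self, element_name, component, payload):
        self._store.setdefault(element_name, {})[component] = payload

    def contains(self, element_name):
        # Assumption: the staged files decide whether the artifact exists
        return Component.FILES in self._store.get(element_name, {})

    def require(self, element_name, component):
        # Callers never branch on presence; they ask and handle one error
        try:
            return self._store[element_name][component]
        except KeyError:
            raise ArtifactComponentMissing(element_name, component) from None


if __name__ == "__main__":
    store = ComponentStore()
    store.add("mine.bst", Component.FILES, b"staged content")
    try:
        store.require("mine.bst", Component.LOGS)
    except ArtifactComponentMissing as e:
        print(e)
```

With something of this shape, a frontend command would catch the one
exception type and report it, rather than each caller growing its own
presence checks.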

I also think we really need to fix Element._update_state() so that
normal human beings are able to modify it and enhance it safely.

In closing:

  * Yes I can see that from an abstract point of view, this
    functionality would be desirable in some cases

  * We also need to be able to deliver complete artifacts and
    have a very clear story about how to deliver these in
    production pipelines, or have the user decide what a production
    build pipeline wants to be mandatory in their artifact cache
    servers.

  * I think we agree that this is a dangerous activity in the present
    state of the codebase, and I think that is the motivation of your
    email which I misinterpreted.

    We need to design some abstraction to handle failure modes for
    partial artifacts, possibly at the ArtifactCache level where
    components might ask the ArtifactCache for a specific component
    and the program is required to fail in a specific way, instead
    of having branch statements bubbling all the way up to `cli.py`.

    Further, I think we need to refactor `Element._update_state()`
    as a prerequisite to handling partial artifacts gracefully. This
    probably involves separating the code into separate classes per
    use case (is the element workspaced? are we calculating state
    for strict cache keys or weak ones? etc.) in order to make this
    code more straightforward.
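
As a very rough illustration of the "separate classes per use case"
idea, and not a description of how `Element._update_state()` works
today, the selection could be made once and the per-case logic kept
apart. All names below are hypothetical, and the precedence of the
workspaced case over the strict/weak distinction is an assumption made
only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementInfo:
    # Hypothetical summary of the inputs which pick the use case
    name: str
    workspaced: bool
    strict: bool


class StatePolicy:
    """One subclass per use case, each holding only its own logic."""

    label = "abstract"

    def update(self, info):
        # A real implementation would compute state here; the sketch
        # only reports which policy handled the element.
        return "{}: state handled by {} policy".format(info.name, self.label)


class WorkspacedPolicy(StatePolicy):
    label = "workspaced"


class StrictKeyPolicy(StatePolicy):
    label = "strict"


class WeakKeyPolicy(StatePolicy):
    label = "weak"


def select_policy(info):
    # The branching lives in exactly one place instead of being
    # interleaved with the state calculation itself.
    if info.workspaced:
        return WorkspacedPolicy()
    if info.strict:
        return StrictKeyPolicy()
    return WeakKeyPolicy()


if __name__ == "__main__":
    info = ElementInfo(name="mine.bst", workspaced=False, strict=True)
    print(select_policy(info).update(info))
```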

I hope this is a better, more understanding reply.

Cheers,
    -Tristan


