Re: [BuildStream] Protect against plugin modifications of artifacts
- From: William Salmon <will salmon codethink co uk>
- To: Tristan Van Berkom <tristan vanberkom codethink co uk>
- Cc: dev buildstream apache org, buildstream-list gnome org
- Subject: Re: [BuildStream] Protect against plugin modifications of artifacts
- Date: Tue, 23 Jun 2020 11:12:38 +0100
On 22/06/2020 18:28, Tristan Van Berkom wrote:
Hi,
On Mon, 2020-06-22 at 18:01 +0100, William Salmon wrote:
[...]
Definitely agree.
The thing is, both are highly problematic.
This is my issue, creating PLUGIN_GENERATED_CONTENT is some times highly
desirable / necessary*
So we should A) make it as easy as possible to make sure
PLUGIN_GENERATED_CONTENT is unique to a cache key, B) minimize the risk
of PLUGIN_GENERATED_CONTENT not getting correctly translated in to the
sand box.
Writing via the vdir API to the cas is a much better way to get
PLUGIN_GENERATED_CONTENT in to the sandbox, IMO because it dose not have
to be escaped to get through sh and cat,
But this is the whole point, data which ends up in artifacts should be
created within the sandbox, where the input of that data is
deterministic (i.e. the data _and_ tooling used to create that output).
If it is generated with scripts residing in the sandbox as everything
else is generated as usual, then there is no extra escaping to to be
done.
Further, as mentioned in a previous mail here, the shell invocations
also need not be shell, we can run python or JS or anything in the
sandbox as well, or stage scripts from sources to run in the sandbox,
there is a lot of leeway here.
dose not need extra deps and
dose not needs manipulation of configuration commands etc.
This indeed.
You are focusing on the convenience of breaking the promise which
BuildStream makes, this is where I cannot get on board.
Yes I am focused on using bst, plugins are a key part of why bst is
usable, but i do also care about it being reproducible and having trust
in the cache keys. Things get in to the sandbox by a variety of ways, by
variable, config etc often as arguments on a programs cli but i really
fail to see the difference in providing something as a config file or as
part of somethings cli.
Of course you need to stage the tooling which is used to create output,
that is because BuildStream intentionally enforces that you always use
deterministic input, and the exact build/copy of the binaries used to
create output, is a part of the input.
> The fact that generating data in python is problematic is not a
reason to avoid fixing illegal writing to the sandbox (which probably
mostly happens *due* to the latter problem), its a reason to ensure that
we *also* ensure that doesn’t happen.
>
While the addition may have been a error in your eyes, If writing to the
sandbox was not public API I would have be arguing that it should be.
For the reasons above.
It was most definitely an error.
The plugins need to have a way to stage files in locations of their
choosing, before the vdir abstraction was in place, the only way to do
this was by providing a directory argument.
This was abused, and has now lead to generation of artifact data which
is generated non-deterministically, without controlling the inputs of
this output, which is at the heart of BuildStream's promise.
I feel you are conflating two things, are plugins `non-deterministic`
and should we let plugins alter whats in the sandbox. The answer to
should plugins be deterministic should clearly be yes and the answer to
should plugins effect what happens in the sandbox seem clearly yes. If
plugins are non deterministic then we need to fix that, plugins effect
the build in a number of ways and if we cant trust plugins then we cant
trust anything about bst as every element is build with a plugin.
Once plugins are trust able, and they must be or the hole concept of
cache keys falls apart. Then your hole argument for why we cant put
things in the sandbox falls apart, given that we must fix plugins so
they are trust able then I fail to see a issue with plugins putting
things in to the sandbox.
Cheers,
-Tristan
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]