Re: [BuildStream] Protect against plugin modifications of artifacts
- From: Tristan Van Berkom <tristan vanberkom codethink co uk>
- To: William Salmon <will salmon codethink co uk>
- Cc: dev buildstream apache org, buildstream-list gnome org
- Subject: Re: [BuildStream] Protect against plugin modifications of artifacts
- Date: Tue, 23 Jun 2020 02:28:54 +0900
Hi,
On Mon, 2020-06-22 at 18:01 +0100, William Salmon wrote:
[...]
Definitely agree.
The thing is, both are highly problematic.
This is my issue, creating PLUGIN_GENERATED_CONTENT is some times highly
desirable / necessary*
So we should A) make it as easy as possible to make sure
PLUGIN_GENERATED_CONTENT is unique to a cache key, B) minimize the risk
of PLUGIN_GENERATED_CONTENT not getting correctly translated in to the
sand box.
Writing via the vdir API to the cas is a much better way to get
PLUGIN_GENERATED_CONTENT in to the sandbox, IMO because it dose not have
to be escaped to get through sh and cat,
But this is the whole point, data which ends up in artifacts should be
created within the sandbox, where the input of that data is
deterministic (i.e. the data _and_ tooling used to create that output).
If it is generated with scripts residing in the sandbox as everything
else is generated as usual, then there is no extra escaping to to be
done.
Further, as mentioned in a previous mail here, the shell invocations
also need not be shell, we can run python or JS or anything in the
sandbox as well, or stage scripts from sources to run in the sandbox,
there is a lot of leeway here.
dose not need extra deps and
dose not needs manipulation of configuration commands etc.
This indeed.
You are focusing on the convenience of breaking the promise which
BuildStream makes, this is where I cannot get on board.
Of course you need to stage the tooling which is used to create output,
that is because BuildStream intentionally enforces that you always use
deterministic input, and the exact build/copy of the binaries used to
create output, is a part of the input.
> The fact that generating data in python is problematic is not a
reason to avoid fixing illegal writing to the sandbox (which probably
mostly happens *due* to the latter problem), its a reason to ensure that
we *also* ensure that doesn’t happen.
>
While the addition may have been a error in your eyes, If writing to the
sandbox was not public API I would have be arguing that it should be.
For the reasons above.
It was most definitely an error.
The plugins need to have a way to stage files in locations of their
choosing, before the vdir abstraction was in place, the only way to do
this was by providing a directory argument.
This was abused, and has now lead to generation of artifact data which
is generated non-deterministically, without controlling the inputs of
this output, which is at the heart of BuildStream's promise.
Cheers,
-Tristan
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]