Re: [BuildStream] Protect against plugin modifications of artifacts



Hey all,

There are a lot of mails to respond to inline, so I would rather answer in a
single email. Please excuse its length.

TLDR
====

I think we should close the API to prevent writing in the sandbox, but I
think we need other constructs beforehand, and we need to rethink how the
impacted plugins should actually work in BuildStream.
A potential way forward I can see is having an 'artifact' source kind; more
details are in the rest of this mail.


Scope
=====

First of all, in order to ensure I correctly understand the premises of this
thread:

The aim is, more specifically, to remove the `Directory.open_files()` call for
writing files in the directory. Is that accurate? Or are there more things
that would disappear?


What would it affect?
=====================

A quick look around the plugins shows that the following plugins would be
affected by such a change:

bst_plugins_experimental/elements/bazelize.py
bst_plugins_experimental/elements/collect_integration.py
bst_plugins_experimental/elements/collect_manifest.py
bst_plugins_experimental/elements/dpkg_build.py
bst_plugins_experimental/elements/dpkg_deploy.py
bst_plugins_experimental/elements/flatpak_image.py
bst_plugins_experimental/elements/oci.py
bst_plugins_experimental/elements/tar_element.py
bst_plugins_containers/elements/docker_container.py

So this seems to be more than "a few" plugins and most of those seem to have
a thing in common: their dependencies are actually part of their output,
in a transformed way. I'll come back to that later.


What should we trust in the sandbox and plugins?
================================================

It is mentioned in this thread that we do not trust Python's APIs to run
repeatably for plugins. However, we do rely on them for BuildStream itself,
and there is little we can do around that area of trust.

I do agree that we should do our best to prevent plugin authors from shooting
themselves in the foot and to help them achieve the most repeatable build
possible. However, I don't think there is a difference between trusting Python
for us and trusting Python for the plugins.
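
To illustrate the kind of care "repeatable" asks of a plugin author, here is a
minimal sketch using only the Python standard library. It is not BuildStream
code; the names `deterministic_tar` and `_normalize` are made up for this
example. It archives a directory so that the bytes depend only on file paths,
contents and modes, and not on walk order, timestamps or file ownership.

```python
import io
import os
import tarfile


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    # Hypothetical helper: drop metadata that varies between machines and runs.
    info.mtime = 0
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info


def deterministic_tar(root: str) -> bytes:
    """Return an uncompressed tar of the regular files under `root`.

    Entries are added in sorted order. Empty directories are not recorded,
    which is a simplification of this sketch.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # sorting in place fixes the traversal order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                arcname = os.path.relpath(path, root)
                tar.add(path, arcname=arcname, recursive=False, filter=_normalize)
    return buf.getvalue()
```

Two directories with the same files should then produce identical archives,
whatever order the files were created in.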


What way forward?
=================

I agree that letting plugins write arbitrary files in a sandbox is not ideal
and prevents us (BuildStream core) from making any assumptions.

There does seem to be a need to execute some code outside of the sandbox
rather than inside it, though. Based on the plugins above, I will try to go
deeper and explain what I see as the problems they have in common, and
alternative ways to allow them to continue doing their work.

All the plugins above have one thing in common: they treat their dependencies
as their input, as other elements would treat their sources. As a brief
summary, here is the list of operations they perform:

- Archiving / checksumming files and writing the result in the sandbox
- Writing a manifest of some kind
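
To make these operations concrete, here is a rough sketch of the checksum and
manifest part in plain Python. It deliberately uses no BuildStream API: the
function names `sha256_file` and `build_manifest` and the JSON layout are
assumptions made up for this example, not what any of the plugins above
actually does.

```python
import hashlib
import json
import os


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> str:
    """Return a JSON manifest mapping relative paths under `root` to SHA-256.

    Symlinked files are hashed through their target, which is a
    simplification of this sketch.
    """
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            entries[rel] = sha256_file(path)
    # sort_keys keeps the output stable regardless of traversal order
    return json.dumps(entries, indent=2, sort_keys=True) + "\n"
```

The point is that the only input here is a tree of files, which is exactly
what a plugin gets from its dependencies today.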

Having them run scripts inside the sandbox means they would have a harder time
telling what is their input and what is only there to let them do their job,
and we currently don't have any construct in BuildStream for such things.

This suggests to me that there is a need for an `artifact` source kind,
which would be the output of a previous element. I will start another
thread on how this would work, to keep things separate, but I believe this
would solve the first part of our problem, that is, being able to separate
build dependencies from "sources".

The second part would then be, for plugins that are 'packaging', like docker or
oci, to rewrite the element to work on the `artifact` source instead. And for
other elements, like `bazelize`, to actually be a `SourceTransform` based on
the previous elements.

A brief example of how I could see that working (more in the next ML thread
I'll start after this; let's not discuss the details here too much unless
relevant):

For an element, we could have as sources:

```
kind: oci_image

sources:
- kind: artifact
  elements: my_stack_for_which_to_create_a_oci_image
- kind: manifesto  # Write a manifesto in the sources based on previous sources

build-depends:
- tar (or buildah)
```

I believe that such a new construct would allow BuildStream plugins to achieve
all they achieve today without writing to the sandbox directly for elements.

I also think that we should potentially try this first on a few plugins
before closing the API altogether.


I hope that helps and might be a way forward.
Cheers,

Ben


