Re: Package manager integration with BuildStream



Hi Tristan,

On Fri, Apr 27, 2018 at 12:23 PM Tristan Van Berkom <tristan.vanberkom@codethink.co.uk> wrote:
[...]
First of all, I'll say that I am not interested in any external .bst
file generators, I think they run counter to the design and if they
exist, can exist completely outside of the scope of BuildStream as a
tool, so we need not discuss them here.

Good.  I've heard that approach suggested a number of times; it just
externalizes too much, and puts an additional burden on anyone working
with an integration that contains elements involving additional steps.

[...]
Now, I don't like the above approach much either, but in the case of
cargo/rust, it is a bit better because it doesn't require that anyone go
installing the rust toolchain on their host just because one package
out of 500 happens to use rust - eventually, when cargo and rustc
become more commonly available on distros, a source plugin to do this
legwork would be better.

You raise an interesting point there; the need for [compatible] host
tools to deal with certain element types.

Your first proposed solution here is going in the right direction, but I
think it can be simplified.  Also, I don't like making sources "treeish"
in the way you did; it would be nice to keep them in a flat list.

+1.  I prefer the ordering to be significant over a tree approach.  It keeps
things simpler to reason about.

First, let's try to think about some commonality and rules for what we
could handle with useful plugins, and see if this covers the ground.
Also, let's call these "source package managers" for technical purposes:
they are package managers, but specifically for source code as far as I
can see, not for system-installed binaries.

  * Source package managers are usually able to discover the
    dependencies by way of reading the depending source package.

    This can be actual source code, or metadata files like Cargo.toml
    or Python's setup.py.
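For illustration, in the Cargo.toml case that discovery could be as
naive as the sketch below (a real plugin would of course use a proper
TOML parser; the function name is made up):

```python
# Naive illustration of dependency discovery from a Cargo.toml:
# collect the keys of the [dependencies] section.
def discover_dependencies(cargo_toml_text):
    deps = []
    in_deps = False
    for line in cargo_toml_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # track whether we are inside the [dependencies] table
            in_deps = (line == "[dependencies]")
        elif in_deps and "=" in line:
            deps.append(line.split("=")[0].strip())
    return deps
```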

  * These source package managers MUST be able to obtain the required
    code and place it in the depending source package's subdirectory at
    build time.

    This is to say that, as much as cargo would love to put all the
    downloaded crates in some system-wide or user-wide location, we
    MUST have a way to beat it into submission, and force it to
    download the requirements into a specific location, like ./crates
    or ./vendor.

Allow me to interpret this as: a package manager that cannot support
this cannot be supported by BuildStream.

Conceptually +1.
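For what it's worth, forcing cargo into a fixed in-tree location might
be as simple as the sketch below, assuming a cargo that provides the
`vendor` subcommand (the helper name is hypothetical):

```python
# Hypothetical helper: build the command a plugin might run to make
# cargo download crates into a fixed in-tree directory instead of a
# user-wide or system-wide cache.
def vendor_command(vendor_dir="./vendor"):
    # `cargo vendor <dir>` copies every dependency pinned in
    # Cargo.lock into <dir>, suitable for fully offline builds.
    return ["cargo", "vendor", vendor_dir]
```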

  * These source package managers MUST have a technique for identifying
    an exact set of sources, such that a "ref" is a constant and there
    is a guarantee that you can never, ever get different data for the
    same ref in different fetch() sessions.

+1.

  * These source package managers MUST never take anything from the
    host system environment into account, or at least must offer
    configuration enforcing that isolation (i.e. we can NEVER allow
    Source implementations to introduce host contamination).

One question is: can we have a sandbox that can be invoked at
track/fetch time, with network access, so that we avoid requiring host
tools while at the same time isolating the tool from the host?  That
is, can we provide package-manager-specific functionality without any
additional host installation?  Let's park that question for now; it's
orthogonal.
 
SourceTransform approach
~~~~~~~~~~~~~~~~~~~~~~~~
Designing a solution for situations which conform to the above points
can potentially be straightforward.

I would suggest that we consider a "SourceTransform" kind of source,
which is also a Source but behaves a little differently.

  * A SourceTransform has an additional directory in context, which
    contains the result of all previous sources.

  * It is an error to ever place a SourceTransform *before* a regular
    Source in an element declaration.

+1.
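A sketch of how that ordering rule might be enforced (the class and
function names here are hypothetical, not BuildStream's real API):

```python
class Source:
    """Stand-in for a regular source."""

class SourceTransform(Source):
    """Stand-in for a source that transforms previous sources."""

def validate_source_order(sources):
    # Once a SourceTransform appears, every following source must
    # also be a SourceTransform; a plain Source declared after a
    # transform is a declaration error.
    seen_transform = False
    for source in sources:
        is_transform = isinstance(source, SourceTransform)
        if seen_transform and not is_transform:
            raise ValueError(
                "regular Source declared after a SourceTransform")
        seen_transform = seen_transform or is_transform
```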
 
  * SourceTransform.track()

    This requires that previous Sources are not only tracked, but also
    *fetched* at the tracked version, so that all previous sources are
    available to stage.

    Running SourceTransform.track() involves first staging all the
    previous sources to a temporary directory, and then running the
    plugin's track() implementation against that staged tree.

    The result of SourceTransform.track() is an updated ref, like any
    other Source.

    Taking rust as the example of choice, the result of its track()
    implementation is a simple Python dictionary representation of
    a Cargo.lock file.

+1.  This is fully in line with how I was thinking about this.  It results in
a simple declaration of the element.  Allows the tracking of elements
that involve package managers to be managed by BuildStream in the
common workflow.  And it keeps what is being tracked transparent.
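For illustration, such a ref might be nothing more than a plain dict
mirroring Cargo.lock; the field names below are made up, not a
proposed schema:

```python
# Illustrative ref for a 'cargo' SourceTransform: one entry per
# crate, pinning an exact version and a digest so that the same
# ref can never yield different data across fetch() sessions.
ref = {
    "crates": [
        # the digest below is a placeholder, not a real checksum
        {"name": "libc", "version": "0.2.40", "sha256": "0000..."},
    ]
}
```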
 
  * SourceTransform.fetch()

    The result of the transform's fetch() implementation is that the
    transform will download the precisely required versions of all
    dependencies according to its own ref, and cache them as normal
    in the source cache.

    Unlike SourceTransform.track(), SourceTransform.fetch() does not
    require the context of the previous sources.

+1.
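As a concrete example, a 'cargo' fetch() implementation could derive
its download URLs purely from the ref; the endpoint below is the
crates.io download API, used here only for illustration (the helper
name is hypothetical):

```python
# Hypothetical helper: map a (name, version) pair from the ref to
# the crates.io download URL for that exact crate tarball, with no
# reference to the previously staged sources.
def crate_url(name, version):
    return ("https://crates.io/api/v1/crates/"
            "{}/{}/download".format(name, version))
```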

  * Element.stage_sources()

    When it comes time to staging sources, all sources are staged in
    order in the regular way, such that:

      o The actual element's source code is staged first
      o The transform cached result is placed somewhere in the
        source's subdirectories where we expect that it will be found
      o Additional patches or downloads of auxiliary resources can
        still happen at any time here

An example of what the YAML might look like, for a rust package, might
be something like this:

    kind: rust
    sources:
    - kind: tar
      url: downloads:thispackage.tar.xz
    - kind: cargo

In the above example, we might expect to have a 'rust' element which
would take care of informing the build system that it should be looking
for its external dependencies at precisely the location where the
'cargo' SourceTransform placed them in the fully staged build
directory, with no extra typing for the user.

Otherwise, we might have an 'autotools' or 'meson' element using this,
in which case it *might* require some prepended configure commands to
ensure that the build system finds the crates at the correct location.

Note however, for the specific case of cargo/rust, there is a
prioritized configuration file which the 'cargo' SourceTransform
plugin can additionally create at the root of the build directory, so
we could have the 'cargo' plugin, at SourceTransform.stage() time, do
the following:

  o Create a ./vendor directory containing the crates
  o Create a .cargo/config file in the root of the build tree
    which informs cargo that it should look for dependencies
    in the ./vendor directory.
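A minimal sketch of that second step (the source-replacement keys are
cargo's real configuration syntax as I understand it, but the helper
function and layout are otherwise hypothetical):

```python
import os

# Cargo source-replacement configuration, telling cargo to resolve
# crates-io dependencies from the in-tree ./vendor directory.
CARGO_CONFIG = """\
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
"""

def write_cargo_config(build_root):
    # Hypothetical helper: drop .cargo/config at the root of the
    # staged build tree so cargo never reaches for the network.
    cfg_dir = os.path.join(build_root, ".cargo")
    os.makedirs(cfg_dir, exist_ok=True)
    path = os.path.join(cfg_dir, "config")
    with open(path, "w") as f:
        f.write(CARGO_CONFIG)
    return path
```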

Of course, plugins can introduce their own specific configuration
options, which can help us to deal with special circumstances and
corner cases, such as rust packages which already have a
.cargo/config or a ./vendor directory, and what to do in those cases.


While this is completely different from your first proposed solution,
it is in the same vein, in that we prefer automation and fitting into
the BuildStream ecosystem by using track()/fetch()/stage() in the
regular ways.


How do you like SourceTransform() ?

+1.
 
Any other great ideas that differ from the two presented ideas ?

Thanks Tristan.  I am already biased towards this approach.  I would
love to hear whether others can find holes in it, and/or have an even
more elegant proposal.
 
Cheers,
    -Tristan

Cheers,

Sander
 
_______________________________________________
Buildstream-list mailing list
Buildstream-list@gnome.org
https://mail.gnome.org/mailman/listinfo/buildstream-list