Re: [BuildStream] Allowing duplicate junctions [Was: Be explicit when overriding junction configuration, or else warn/error]

From: Benjamin Schubert <contact benschubert me>
To: Tristan Van Berkom <tristan vanberkom codethink co uk>
Cc: Jürg Billeter <j bitron ch>, Chandan Singh <chandan chandansingh net>, "buildstream-list gnome org" <buildstream-list gnome org>
Date: Wed, 03 Jun 2020 08:38:28 +0000

Hey Tristan,

I must say I find this thread hard to understand. I'm going to try to
summarize it, both to ensure I understood everything and potentially help
others follow.

TLDR: I like the idea of junctions that users have to configure explicitly,
rather than relying on built-in knowledge of specific use cases.
However, I am not 100% sure about the described format.

I also have a few comments inline.

So here is my understanding:

Problem: We want to allow projects to depend on another project multiple times,
and across junctions.

Why it can't be done today: We ensure that the `name` of each project is unique
in the pipeline, so the same project cannot appear more than once.
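
To make the current restriction concrete, here is a minimal Python sketch of
such a uniqueness check. It is not BuildStream's actual loader code: the
function name, the exception and the (junction, project name) pairs are all
made up for illustration.

```python
class DuplicateProjectError(Exception):
    """Hypothetical error raised when a project name is loaded twice."""


def check_unique_project_names(junctions):
    """Return a {project_name: junction_element} map, or raise on a clash.

    `junctions` is an iterable of (junction_element, project_name) pairs;
    this shape is an assumption made for the sketch.
    """
    seen = {}
    for junction_element, project_name in junctions:
        if project_name in seen:
            # Same project name reached through two different junctions.
            raise DuplicateProjectError(
                f"project '{project_name}' is loaded by both "
                f"'{seen[project_name]}' and '{junction_element}'"
            )
        seen[project_name] = junction_element
    return seen
```

With a check of this kind, reaching freedesktop-sdk once directly and once
through gnome-build-meta is rejected purely because the project name repeats.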

There are two ways we want to be able to do this:

- Allowing a parent project to tell a junction to use another element as the
junction, for example, if we build a project over gnome-build-meta, and we
also have a freedesktop-sdk junction, we might want gnome-build-meta to use
our exact junction of the project, not theirs.

- Allowing a project to rename the project names of its junction, or their
junctions, thus removing the name clashes and stating "this is a different
project". This would be useful, for example, for "build-only" junction
dependencies.

For this, we want to introduce two new constructs in a `junction` configuration:

- `overrides`, which takes a mapping of element -> element, which replaces every
reference to the `key` element by the `value` element, thus solving problem 1.

- `project-names`, which takes a mapping of name -> name, which replaces the name
of the `key` by the `value`, in the context of the junction, thus solving
problem 2.
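
As a sanity check of my reading, the two mappings could behave roughly like
the Python sketch below. Everything in it (the function names, the plain
dicts, the example element and project names) is an assumption made for
illustration, not the proposed implementation.

```python
def resolve_junction(element, overrides):
    """Return the element to use wherever `element` is referenced.

    `overrides` maps element -> element; unlisted elements are unchanged.
    """
    return overrides.get(element, element)


def resolve_project_name(name, project_names):
    """Return the project name to use in the context of the junction.

    `project_names` maps name -> name; unlisted names are unchanged.
    """
    return project_names.get(name, name)


# Hypothetical configuration of a single junction.
overrides = {"freedesktop-sdk.bst": "my-freedesktop-sdk.bst"}
project_names = {"toolchain": "toolchain-build-only"}

# Problem 1: the subproject's junction is replaced by ours.
print(resolve_junction("freedesktop-sdk.bst", overrides))
# Problem 2: the renamed project no longer clashes on its name.
print(resolve_project_name("toolchain", project_names))
```

In both cases anything not listed in the mapping is left untouched, which is
how I understand "explicit" in this proposal.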

Now my personal opinion:

I find it weird that for `overrides`, we use elements as keys, and for names, we use
names. I would have expected us to either always use names, or elements.
I think we should use one and only one of them.

- Names would be more flexible, as we would require less knowledge about the junctioned
project (where does it store the junction?).

- Elements are more rigid but less error prone and completely explicit.

I also am not sure how this would work with deeply nested junctions:

- If we want to override, would we specify something like:

```
kind: junction

overrides:
  "myjunction.bst:subjunction.bst": my_own_junction.bst
```

Or how would we construct this?

- Same for names: if a junction has two junctions on the same project and renames one of
them, how would I rename each of them? Would I be targeting the name in the junction I'm
overriding?

So if A has a junction on B, which has two junctions on C, namely C and C2, would I do:

```
kind: junction

config:
  project-names:
    c2: my-own-c
```

In summary, I think we need to either use junction names for keys, or elements, but not
mix them. I personally lean towards elements, as they are more explicit.

Apart from that, I'm happy with the proposal.

More inline.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, 12 May 2020 12:39, Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:

[...]


On Tue, 2020-05-12 at 11:47 +0200, Jürg Billeter wrote:

Hi Tristan,
On Fri, 2020-05-08 at 16:50 +0900, Tristan Van Berkom wrote:

[...]

    Cross architecture bootstrapping
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    When bootstrapping a runtime for a different architecture, it can
    be interesting to use the same toolchain project configured
    multiple times with different project options defining which host
    and target architectures to build libc/gcc under.

    When combining this ability with remote execution, we can
    streamline the process of bootstrapping a system under any
    architecture which we have runners for on the RE cluster.


A possible solution for this use case is to extend the key used for
conflict detection a bit. Instead of only using the project name as
key, we could include the values of (selected) project options as well.
E.g., the target architecture option would be sensible to include, in
my opinion.


I think for one thing, we don't know what a target architecture option
is; we only know that an option is an "architecture" option, but this
could be translated into a host/sandbox requirement, or used in
compiler build instructions to define a target architecture, or it
could be used for something completely orthogonal to the host
requirement or the architecture on which compiled code is expected to
run: I think we don't have the right to know.

Aside from that, I think it's going to be undesirable to stray into the
territory of comparing junction/project configurations, there is no
reason why I should be allowed to configure the same project with
different project options, but not with different source configurations
(different versions).


I do agree there: we should not restrict _how_ we can inherit from junctions,
and target/arch is not the only thing we might want to change.

Note that we only currently distinguish:

-   This project was instantiated once (possibly the same instance was
    used multiple times by way of "overrides" or by way of the junction
    "target" feature).

-   This project was instantiated multiple times

    We don't recognize equality of projects which are explicitly
    instantiated multiple times, they might accidentally produce exactly
    the same cache keys, but it is no less of an error if they do.

    It would be good to preserve this simplicity I think.

    Auxiliary projects which provide static build-only dependencies
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    When one project depends on another project for some static data
    which will be consumed as build-only dependencies, the data
    from the junctioned project is consumed statically as is, and there
    is no concern of runtime dependencies being propagated forward to
    reverse dependency projects which might consume the same junction.


This also includes statically linked libraries where no runtime data is
required (or runtime data is in a private prefix).
"Isolated junctions" seems like a sensible solution for this use case.


Right, I specifically like this because it abstracts away some of the
problem from any reverse dependency projects, and it is not tied to any
specific use case, it only states:

"I use this subproject internally to produce data which I consume
verbatim, and it is an error if reverse dependency projects end up
with runtime dependencies from this internal subproject"


I can see some value in that. A `private: true` in the `config` section,
forbidding runtime dependencies on the junction, might be enough?

Or we could go differently and state that `private: true` forbids a sub-project
from accessing the junction?

But that might be better in another ML thread.

Cheers,

Ben

Follow-Ups:
- Re: [BuildStream] Allowing duplicate junctions [Was: Be explicit when overriding junction configuration, or else warn/error]
  - From: Tristan Van Berkom
