Feature proposal: multiple cache support



Hello

This is a proposal to:

  * allow projects and users to specify multiple artifact caches for
    pulling and pushing.

  * make pipelines pull artifacts from any cache that has a
    given artifact available, in a 'priority' order

  * make pipelines push artifacts to the highest 'priority' cache by
    default

  * add a `bst push --cache=$name` option to allow pushing to one
    of the other configured caches

We have a few use cases in mind for this feature.

  * A project that operates multiple caches with different expiry
    algorithms, maybe one cache with artifacts built from release
    branches and another cache for artifacts built from 'master'

  * A user or team that wants to set up a local cache for artifacts
    built from their work-in-progress branches, but wants to still
    be able to use pre-built artifacts from project-wide autobuilds

  * Projects that depend on other projects, where each project has
    its own autobuilder and cache. [Recursive pipelines, a.k.a.
    inter-project dependencies, aren't yet possible with BuildStream,
    but they have been planned for a while and Jürg is going to propose
    them as a feature very soon].

This feature has also been requested here:
https://gitlab.com/BuildStream/buildstream/issues/85

Current implementation
----------------------

Right now BuildStream allows the user to specify this in their
~/.config/buildstream.conf:

    artifacts:
      push-url: xxx
      pull-url: xxx
      push-port: xxx

Projects can choose to override this in the project.conf file.
Users can then override what the project set from
`~/.config/buildstream.conf` by doing:

    projects:
      my-project:
        artifacts:
          push-url: xxx

Overriding is "all or nothing", i.e. if the project specifies a push-url
and I then override pull-url, the push-url will be set to None.
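
To make the "all or nothing" behaviour concrete, here is a rough Python
sketch of it. This is not BuildStream's actual code: the function name
`effective_artifacts` and the example.com URLs are made up for
illustration.

```python
# Hypothetical sketch of the current "all or nothing" override rule.
ARTIFACT_KEYS = ("pull-url", "push-url", "push-port")


def effective_artifacts(project_conf, user_override):
    """Return the artifact settings in effect for one project.

    If the user supplies an override section at all, it replaces the
    project's section wholesale; keys the user left out become None.
    """
    chosen = user_override if user_override is not None else project_conf
    return {key: chosen.get(key) for key in ARTIFACT_KEYS}


# Placeholder values, not real servers.
project_conf = {
    "pull-url": "https://example.com/cache",
    "push-url": "ssh://example.com/cache",
}
user_override = {"pull-url": "https://mirror.example.com/cache"}

result = effective_artifacts(project_conf, user_override)
```

Here `result["push-url"]` ends up as None even though the project set
one, because only the user's section is consulted.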

Proposed changes
----------------

Users will specify caches in the same places as before. However, the
format will change to a list of caches, and each cache will have a
name.

For example, the GNOME SDK project could specify this in their
`project.conf` file:

     artifacts:
       - name: gnome-sdk-releases
         pull-url: https://sdk.gnome.org/cache-releases

       - name: gnome-sdk-latest
         pull-url: https://sdk.gnome.org/cache-latest

The order is significant. In this case the 'latest' cache has higher
priority than the 'releases' cache as it is listed afterwards. Thus if
an artifact is in the 'latest' cache it will always be pulled from
there, not from the 'releases' cache.
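
The lookup could be sketched roughly like this in Python. The names
`pull_order`, `find_cache_for` and `has_artifact` are hypothetical, and
the `contents` dict merely stands in for querying the remote caches.

```python
def pull_order(caches):
    """Return caches in the order they should be tried: last listed first."""
    return list(reversed(caches))


def find_cache_for(artifact_key, caches, has_artifact):
    """Return the name of the highest-priority cache holding the artifact.

    `has_artifact(cache, key)` is a hypothetical callback standing in for
    a query against the remote cache. Returns None if no cache has it.
    """
    for cache in pull_order(caches):
        if has_artifact(cache, artifact_key):
            return cache["name"]
    return None


caches = [
    {"name": "gnome-sdk-releases",
     "pull-url": "https://sdk.gnome.org/cache-releases"},
    {"name": "gnome-sdk-latest",
     "pull-url": "https://sdk.gnome.org/cache-latest"},
]

# Pretend cache contents, keyed by cache name (illustration only).
contents = {
    "gnome-sdk-releases": {"abc", "def"},
    "gnome-sdk-latest": {"abc"},
}


def has_artifact(cache, key):
    return key in contents[cache["name"]]
```

With this data, artifact "abc" is found in 'gnome-sdk-latest' even
though both caches have it, while "def" falls back to
'gnome-sdk-releases'.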

We recommend that automated build servers are the only ones with push
access to the cache. On the GNOME SDK build servers we could add a
`~/.config/buildstream.conf` file that overrides the project config:

     projects:
       gnome-sdk:
         artifacts:
         - name: gnome-sdk-releases
           pull-url: https://sdk.gnome.org/cache-releases
           push-url: ssh://ostree@sdk.gnome.org/cache-releases

         - name: gnome-sdk-latest
           pull-url: https://sdk.gnome.org/cache-latest
           push-url: ssh://ostree@sdk.gnome.org/cache-latest

In this example, artifacts would be pushed to the 'latest' cache only
by default. They could be manually pushed to the 'releases' cache when
doing release builds by running `bst push --cache=gnome-sdk-releases`.
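
Choosing the push target might look something like the Python sketch
below. `select_push_cache` is a hypothetical name and the push URLs are
placeholders. Skipping caches that have no push-url is my assumption;
the proposal itself only says the highest priority cache is the default.

```python
def select_push_cache(caches, name=None):
    """Pick the cache to push to.

    With no name, use the highest-priority (last listed) cache that has a
    push-url. With a name, as in `bst push --cache=NAME`, use that cache.
    Ignoring caches without a push-url is an assumption of this sketch.
    """
    pushable = [c for c in caches if c.get("push-url")]
    if name is None:
        if not pushable:
            raise ValueError("no cache with a push-url is configured")
        return pushable[-1]
    for cache in pushable:
        if cache["name"] == name:
            return cache
    raise ValueError("no pushable cache named %r" % name)


# Placeholder URLs, not the real GNOME servers.
caches = [
    {"name": "gnome-sdk-releases",
     "push-url": "ssh://example.com/cache-releases"},
    {"name": "gnome-sdk-latest",
     "push-url": "ssh://example.com/cache-latest"},
]

default_target = select_push_cache(caches)
release_target = select_push_cache(caches, name="gnome-sdk-releases")
```

An unknown cache name raises an error here rather than silently falling
back to the default, which seems the safer choice for release builds.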

Note that user config overrides project config *completely*. If I have
an empty 'artifacts' section for a project in my buildstream.conf, that
means "ignore everything from project.conf". This will be inconvenient
for some use cases, but it allows removing caches from the project.conf
that one may not have access to on certain machines.
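
As a sketch of that rule in Python (hypothetical function name, and
assuming the config files have already been parsed into dicts and
lists):

```python
def caches_for_project(project_name, project_caches, user_conf):
    """Return the list of caches in effect for a project.

    If the user's config has an 'artifacts' key for this project, even an
    empty one, it replaces the project.conf list completely. Otherwise
    the project.conf list is used unchanged.
    """
    overrides = user_conf.get("projects", {}).get(project_name, {})
    if "artifacts" in overrides:
        # An empty YAML section typically parses to None; treat it as [].
        return list(overrides["artifacts"] or [])
    return list(project_caches)


project_caches = [
    {"name": "gnome-sdk-releases",
     "pull-url": "https://sdk.gnome.org/cache-releases"},
    {"name": "gnome-sdk-latest",
     "pull-url": "https://sdk.gnome.org/cache-latest"},
]

# The user has an empty 'artifacts' section for gnome-sdk.
user_conf_empty = {"projects": {"gnome-sdk": {"artifacts": None}}}
# The user says nothing about gnome-sdk.
user_conf_silent = {"projects": {}}

no_caches = caches_for_project("gnome-sdk", project_caches, user_conf_empty)
inherited = caches_for_project("gnome-sdk", project_caches, user_conf_silent)
```

The first call yields no caches at all, the second yields both caches
from project.conf.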

That's more or less it.
Please let me know your comments!
Sam

--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575

