Re: Feature proposal: multiple cache support V2



Hi,

On Mon, Oct 16, 2017 at 4:28 PM Sam Thursfield <sam thursfield codethink co uk> wrote:
Hello

Here's a second attempt at an implementation plan for multiple cache
support.

Summary:

   * make our ssh:// cache URLs usable for pulls as well as pushes

I'll have a look at #112 to see why ssh:// URLs specifically.
 
   * change the way projects and users specify artifact caches, so that
     each entry is just a single (canonical) ssh://, http:// or https://
     URL instead of having `pull-url` and `push-url` pairs.

   * allow projects and users to specify multiple artifact caches in a
     list

   * make pipelines pull artifacts from any cache that has a given
     artifact available, in a 'priority' order

   * add `bst pull --cache=URL` and `bst push --cache=URL` options to
     allow pulling from and pushing to arbitrary caches

   * add a `--cache-timeout` configuration option to control how long
     BuildStream waits for a cache to respond before considering it
     unreachable

Can we detect the network being unavailable, or do we need to try all cache entries with a timeout?
 

Canonical push/pull URLs issue:
https://gitlab.com/BuildStream/buildstream/issues/112

Multiple caches support issue:
https://gitlab.com/BuildStream/buildstream/issues/85


Proposed changes
----------------

As described in issue #112, we will first modify the artifact push
protocol so that ssh:// cache URLs supply the corresponding http://
and/or https:// pull URLs as part of the `hello` message. This means
that users specify one or the other type of URL for a cache, but do
not need to specify both URLs for a single cache.

Users and projects will specify caches in the same places as before.
However, in each place an ordered list of URLs for different caches
will be allowed.

For example, the GNOME SDK project could specify this in their
`project.conf` file:

      artifacts:
        - https://sdk.gnome.org/cache-releases
        - https://sdk.gnome.org/cache-latest

The order is significant. In this case 'cache-latest' has higher
priority than 'cache-releases' as it is listed afterwards. Thus if an
artifact is in 'cache-latest' it will always be pulled from there, not
from 'cache-releases'.

Just wondering if this is the right ordering interpretation or whether we have it reversed. If we take the list order to be the order of preference, we should try the first entry, then the second if the artifact is not present there, and so forth.
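
To make the two readings concrete, here is a minimal Python sketch. It is not BuildStream code: `pick_cache`, the `has_artifact` callback and the `last_wins` flag are hypothetical names used only for illustration. With `last_wins=True` it follows the proposal (later entries have higher priority); with `last_wins=False` it treats the list as an order of preference.

```python
from typing import Callable, Optional, Sequence


def pick_cache(
    caches: Sequence[str],
    has_artifact: Callable[[str], bool],
    last_wins: bool = True,
) -> Optional[str]:
    """Return the highest-priority cache URL that holds the artifact.

    `has_artifact` is a hypothetical callback that reports whether a
    given cache URL has the artifact available.
    """
    # Proposal as written: entries listed later take priority.
    # Alternative reading: entries listed first take priority.
    ordered = reversed(caches) if last_wins else caches
    for url in ordered:
        if has_artifact(url):
            return url
    # No configured cache has the artifact; the caller would build it.
    return None
```

With the GNOME SDK list above and an artifact present in both caches, `last_wins=True` selects 'cache-latest' and `last_wins=False` selects 'cache-releases'.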
 
For situations where you want to push build results to the cache, you
would override the project config with the push URL. For example, on the
GNOME SDK build servers we could add this to the
`~/.config/buildstream.conf` file:

      project:
        gnome-sdk:
          artifacts:
            - ssh://ostree@sdk.gnome.org:22200/cache-latest

This would cause artifacts to be pushed to and pulled from 'cache-latest'.
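
As a rough sketch of how that override might be resolved (Python, with hypothetical helper names `effective_caches` and `can_push`): it assumes the user's per-project list replaces the list from `project.conf` rather than being merged with it, and that only ssh:// URLs accept pushes. Neither assumption is settled by the proposal text.

```python
from typing import Any, List, Mapping, Sequence
from urllib.parse import urlsplit


def effective_caches(
    project_caches: Sequence[str],
    user_conf: Mapping[str, Any],
    project_name: str,
) -> List[str]:
    """Return the cache list to use for a project.

    Assumption: a per-project `artifacts` list in the user configuration
    replaces the list from project.conf instead of extending it.
    """
    override = (
        user_conf.get("project", {}).get(project_name, {}).get("artifacts")
    )
    if override is not None:
        return list(override)
    return list(project_caches)


def can_push(url: str) -> bool:
    """Assumption: only ssh:// cache URLs accept pushes."""
    return urlsplit(url).scheme == "ssh"
```

Under these assumptions, the build server configuration above yields a single push-capable cache, while a machine with no user override keeps the two https:// caches and can only pull.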

To push to 'cache-releases' in this example, one would run a command like
`bst push --cache=ssh://ostree@sdk.gnome.org:22200/cache-releases`.

There will no longer be a way to remove a cache from the list of
configured caches. BuildStream will try to contact each cache on
startup and any that do not respond within a given timeout will
be considered unreachable. We can't anticipate a timeout value that will
work well in all situations so we will make it configurable through a
`--cache-timeout` option.
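
For illustration only, a startup reachability check could look roughly like the Python sketch below. The names `tcp_probe`, `reachable_caches` and `DEFAULT_PORTS` are hypothetical, and a plain TCP connect is an assumption standing in for whatever handshake the real cache protocol would use.

```python
import socket
from typing import Callable, Iterable, List
from urllib.parse import urlsplit

# Assumed default ports for the URL schemes named in the proposal.
DEFAULT_PORTS = {"ssh": 22, "http": 80, "https": 443}


def tcp_probe(url: str, timeout: float) -> bool:
    """Return True if a TCP connection to the cache host opens in time."""
    try:
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme)
        if parts.hostname is None or port is None:
            return False
        # create_connection raises OSError (including timeouts and
        # name-resolution failures) when the host cannot be reached.
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except (OSError, ValueError):
        return False


def reachable_caches(
    urls: Iterable[str],
    timeout: float,
    probe: Callable[[str, float], bool] = tcp_probe,
) -> List[str]:
    """Keep only the caches that respond within `timeout` seconds."""
    return [url for url in urls if probe(url, timeout)]
```

Because the probes here run one after another, the worst-case startup delay is the number of configured caches multiplied by the timeout, which is one reason the timeout value matters.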

Would `--cache=""` be sufficient to disable fetching from caches?

Cheers,

Sander

Sam

--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575
_______________________________________________
Buildstream-list mailing list
Buildstream-list gnome org
https://mail.gnome.org/mailman/listinfo/buildstream-list

