Re: Feature proposal: multiple cache support V2



Sorry for the delayed response here :-)

On 20/10/17 15:13, Sander Striker wrote:
On Mon, Oct 16, 2017 at 4:28 PM Sam Thursfield <
sam thursfield codethink co uk> wrote:

    * make our ssh:// cache URLs usable for pulls as well as pushes


I'll have a look at #112 to see why ssh:// URLs specifically.

I presume you already did, but in case not: OSTree pushes require two-way communication and need to sit behind some kind of authentication so that there is access control. Currently SSH is the simplest solution. There are other options of course; HTTPS comes to mind, although the fact that our push protocol is stateful makes that harder than it is for Git.

    * change the way projects and users specify artifact caches, so that
      each entry is just single (canonical) ssh://, http:// or https://
      URL instead of having `pull-url` and `push-url` pairs.

    * allow projects and users to specify multiple artifact caches in a
      list

    * make pipelines pull artifacts from any cache that has a given
      artifact available, in a 'priority' order

    * add `bst pull --cache=URL` and `bst push --cache=URL` option to
      allow pushing to arbitrary caches

    * add a `--cache-timeout` configuration option to control how long
      BuildStream waits for a cache to respond before considering it
      unreachable


Can we detect network being unavailable or do we need to try all cache
entries with a timeout?

I'm not sure how to detect network availability in a platform-agnostic way. With my GNOME hat on I'd ask NetworkManager, but that doesn't sound like the right approach here.

It also doesn't solve the case of partial connectivity, such as restrictive firewalls which may allow access to one cache but silently block connections to others.
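
Probing each cache with a timeout avoids the detection problem and also copes with partial connectivity. Here is a rough sketch of what I mean, not BuildStream code: `is_reachable`, `reachable_caches` and the `timeout` parameter are made-up names, and I'm assuming a plain TCP connect is a good enough liveness check.

```python
import socket
from urllib.parse import urlsplit

# Default ports for the URL schemes mentioned in the proposal.
DEFAULT_PORTS = {"ssh": 22, "http": 80, "https": 443}


def is_reachable(url, timeout=5.0):
    """Return True if a TCP connection to the cache host succeeds in time."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if parts.hostname is None or port is None:
        return False
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures and timeouts alike.
        return False


def reachable_caches(urls, timeout=5.0):
    """Filter the configured caches down to those that respond, keeping order."""
    return [url for url in urls if is_reachable(url, timeout)]
```

A firewall that silently drops packets shows up here as a timeout on that one cache only, so the others remain usable.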

Proposed changes
----------------

...

Users and projects will specify caches in the same places as before.
However in each place an ordered list of URLs for different caches
will be allowed.

For example, the GNOME SDK project could specify this in their
`project.conf` file:

       artifacts:
         - https://sdk.gnome.org/cache-releases
         - https://sdk.gnome.org/cache-latest

The order is significant. In this case 'cache-latest' has higher
priority than 'cache-releases' as it is listed afterwards. Thus if an
artifact is in 'cache-latest' it will always be pulled from there, not
from 'cache-releases'.


Just wondering if this is the right ordering interpretation or whether we
have it reversed.  If we assume the order to be the order of preference, it
means we should try the first entry, if not present try the second, and so
forth.

Yes, on second thoughts my proposal is a bit backwards :-) Higher in the list should mean higher precedence.
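
To make the corrected ordering concrete, here is a minimal sketch in which the first cache in the list that holds the artifact wins. The function name, the `has_artifact` callback and the cache contents are all hypothetical, purely for illustration.

```python
def first_cache_with(artifact_key, caches, has_artifact):
    """Return the first cache URL, in list order, that holds the artifact.

    `has_artifact(url, key)` is a caller-supplied check (hypothetical here).
    """
    for url in caches:
        if has_artifact(url, artifact_key):
            return url
    return None  # No configured cache has it; the caller would build locally.


# Illustrative data only: which keys each cache holds is made up.
caches = [
    "https://sdk.gnome.org/cache-latest",    # listed first, so tried first
    "https://sdk.gnome.org/cache-releases",
]
contents = {
    "https://sdk.gnome.org/cache-latest": {"abc123"},
    "https://sdk.gnome.org/cache-releases": {"abc123", "def456"},
}


def lookup(url, key):
    return key in contents.get(url, set())
```

With this data, `first_cache_with("abc123", caches, lookup)` picks 'cache-latest' even though 'cache-releases' also has the artifact, and "def456" falls through to 'cache-releases'.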

...

There will no longer be a way to remove a cache from the list of
configured caches. BuildStream will try to contact each cache on
startup and any that do not respond within a given timeout will
be considered unreachable. We can't anticipate a timeout value that will
work well in all situations so we will make it configurable through a
`--cache-timeout` option.


Is a --cache="" sufficient to disable fetching from caches?

I wasn't planning on having `bst build --cache...` at all. There are ways it could make sense, but I'm not entirely sure how best to make it interact with the multiple cache support. I.e. if the user specifies a cache on the command line, does that replace those from their user config, or replace those from the user config *and* the project config, or is it just added to the list of caches that we fetch from?
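
To make the three possibilities easier to compare, here is a sketch of each as a merge rule. None of this exists in BuildStream: `CliCacheMode` and `effective_caches` are made-up names, and the relative ranking (command line first, then user config, then project config) is only my assumption.

```python
from enum import Enum


class CliCacheMode(Enum):
    REPLACE_USER = "replace-user"   # CLI cache replaces the user config only
    REPLACE_ALL = "replace-all"     # CLI cache replaces user and project config
    ADD = "add"                     # CLI cache is added to the existing list


def effective_caches(cli_cache, user_caches, project_caches, mode):
    """Combine cache lists; earlier entries are tried first.

    Assumption: user caches rank above project caches, and a cache given
    on the command line ranks above both.
    """
    if cli_cache is None:
        return list(user_caches) + list(project_caches)
    if mode is CliCacheMode.REPLACE_USER:
        return [cli_cache] + list(project_caches)
    if mode is CliCacheMode.REPLACE_ALL:
        return [cli_cache]
    return [cli_cache] + list(user_caches) + list(project_caches)
```

Whichever rule we pick, the result is still a single ordered list, so the pull logic would not need to know where each entry came from.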

Sam

--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575

