Re: [BuildStream] CAS server resource names & instance names



On 10/12/2018 11:41, Jürg Billeter wrote:
On Mon, 2018-12-10 at 09:08 +0000, Daniel Silverstone via BuildStream-list wrote:
On Fri, Dec 07, 2018 at 17:32:37 +0000, Jim MacArthur via BuildStream-list wrote:
So, with the default instance_name of a blank string, we should have a
leading slash on the resource name. If we add this in the client, the server
will reject it at the moment. It's a similar situation for writing blobs.

It looks like the CAS server in BuildGrid already handles both cases. Is
there anything important I'm missing here, or can I go ahead and correct the
resource names in BuildStream?
Slightly earlier in the same proto, there is:

// A single server MAY support multiple instances of the execution system, each
// with their own workers, storage, cache, etc. The exact relationship between
// instances is up to the server. If the server does, then the `instance_name`
// is an identifier, possibly containing multiple path segments, used to
// distinguish between the various instances on the server, in a manner defined
// by the server. For servers which do not support multiple instances, then the
// `instance_name` is the empty path and the leading slash is omitted, so that
// the `resource_name` becomes `uploads/{uuid}/blobs/{hash}/{size}`.

Which I would project to "if `instance_name` is the empty path, the blob URI
does not need to start with a `/`" as well.
This matches my interpretation of the spec. It also matches what the
Bazel client does [1].

I imagine BuildGrid is handling both no leading slash, and a leading slash,
purely as a convenience for how this specification *could* be read both ways.

From my PoV, I'd probably go for a combination approach and update the CAS server
in BuildStream to support both, but not change the client since the way I read the
spec, the client is correct currently.
I would not change anything, client or server side.

We have to change something on the client side if we're to use CASCache for remote execution, since some remote storage services (e.g. RBE) require an instance name for storage.

I also don't see any mechanism for CASCache to know whether the server supports instances, other than trying both forms and handling the errors, so I'd be strongly in favour of making the artifact cache accept instance names in resource names, even if the instance name does not become part of the key and all instances map to the same namespace.
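
As a rough sketch of what accepting an optional instance name could look like on the server side: `parse_read_resource_name` below is a hypothetical helper, not existing BuildStream code. It assumes the read form `{instance_name}/blobs/{hash}/{size}`, treats the first `blobs` segment as the boundary (so it assumes instance names never contain a `blobs` segment), and returns the instance name separately so the caller can leave it out of the key.

```python
def parse_read_resource_name(resource_name):
    """Split '[{instance_name}/]blobs/{hash}/{size}[/...]' into its parts.

    Hypothetical helper: accepts a missing instance name, an empty one
    with a leading slash, or a multi-segment one.
    """
    segments = resource_name.split("/")
    try:
        # Assumption: the first 'blobs' segment ends the instance name.
        index = segments.index("blobs")
    except ValueError:
        raise ValueError("no 'blobs' segment in {!r}".format(resource_name))

    if len(segments) < index + 3:
        raise ValueError("missing hash or size in {!r}".format(resource_name))

    instance_name = "/".join(segments[:index])
    digest_hash = segments[index + 1]
    size_bytes = int(segments[index + 2])
    # Anything after the size is deliberately ignored.
    return instance_name, digest_hash, size_bytes


# The instance name is parsed but need not take part in the cache key.
for name in ("blobs/abc/3", "/blobs/abc/3", "main/blobs/abc/3"):
    instance, digest_hash, size_bytes = parse_read_resource_name(name)
    print(repr(instance), digest_hash, size_bytes)
```

With this approach all three forms resolve to the same `(hash, size)` pair, which is what mapping every instance to one namespace would require.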

The paragraph Daniel pointed out above has some worrying assertions in it:

a) "the `instance_name` is an identifier, possibly containing multiple path segments"

b) "Anything after the `size` is ignored"

Which I believe makes it impossible to exactly determine if a resource name includes an instance name.
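
To illustrate the ambiguity, here is a small sketch that lists every way a read resource name could be split if (a) and (b) are both taken literally. `candidate_interpretations` is a hypothetical helper written for this example, and it assumes the `{instance_name}/blobs/{hash}/{size}` form with a decimal size.

```python
def candidate_interpretations(resource_name):
    """Return every (instance_name, hash, size) reading of a resource name.

    Hypothetical helper: any 'blobs' segment followed by a hash and a
    numeric size is a valid split point, because the instance name may
    span several segments and trailing segments are ignored.
    """
    segments = resource_name.split("/")
    candidates = []
    for i, segment in enumerate(segments):
        has_hash_and_size = i + 2 < len(segments)
        if segment == "blobs" and has_hash_and_size and segments[i + 2].isdigit():
            instance_name = "/".join(segments[:i])
            candidates.append((instance_name, segments[i + 1], int(segments[i + 2])))
    return candidates


# Both readings are consistent with the two assertions:
#   instance 'a', blob abc/1, trailing 'blobs/def/2' ignored
#   instance 'a/blobs/abc/1', blob def/2
for candidate in candidate_interpretations("a/blobs/abc/1/blobs/def/2"):
    print(candidate)
```

A name with a single `blobs` segment yields exactly one reading, so the problem only appears when an instance name is allowed to contain `blobs` itself.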


