Re: [BuildStream] Proposal: Configuration format for CAS/artifact server split



Hi Tristan,

On Tue, 2019-07-30 at 12:33 +0100, Tristan Daniël Maat via buildstream-list wrote:
Hi,

I've started looking at [supporting separate endpoints for
CAS/artifact services][1]. While the issue seems pretty
straightforward, we'll need to change the configuration format
slightly, and I wanted to check if I'm going in the right direction
since I've not really played with CAS all that much.

Sounds sensible. As I understand it, we want to use a shared CAS for
storing artifacts, and as artifacts are more than just the payload, we'd
want a way to store the payload in one place while storing the artifact
metadata in another.

The configuration format I'd like to use looks like this:

```yaml
artifacts:
   # Un-split caches can still be defined using the old spec
   - url: https://foo.com:11001
     server-cert: foo.crt

   # Split caches are made up of two separate specs like this
   - metadata:   # Can anyone think of more obvious names for users?
       url: https://foo2.com:11001
       server-cert: foo2.crt
     blobs:      # Can anyone think of more obvious names for users?
       url: https://foo3.com:11001
       server-cert: foo3.crt
```

I think this looks confusing: without heavily commenting your config
file, it won't be obvious to anyone what this stuff means.

Right now `artifacts` is clearly a list of "servers", and from what I
gather from your proposal, it must be beneficial to us if the
relationship of the CAS cache and the artifact metadata cache can be
declared and known (which is why they appear in the same dictionary in
the config, I presume).

If it is not important to express that a given artifact server is
related to a given CAS server, then I suggest making this config more
readable by just adding a new "type" field, letting the user decide what
to store in each server (payload, artifact data, or both).
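
To make that concrete, here is a rough sketch of what the flat list
could look like. The field name `type` and the values `both`,
`artifact` and `payload` are placeholders of mine and nothing here is
implemented; the URLs and certs are just the ones from your example.

```yaml
artifacts:
   # Hypothetical 'type' field; names and values are placeholders
   - url: https://foo.com:11001
     server-cert: foo.crt
     type: both       # payload and artifact data on one server

   - url: https://foo2.com:11001
     server-cert: foo2.crt
     type: artifact   # artifact data only

   - url: https://foo3.com:11001
     server-cert: foo3.crt
     type: payload    # payload only
```

Note that in this form nothing ties foo2 to foo3, which is the
trade-off: it reads as a plain list of servers, but the relationship
between the two is not declared.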

If it is important that the user expresses the relationship between the
two servers, then let's indeed look for better words instead of
'metadata' and 'blobs' (and let's never use the word 'blob' when
addressing an end user).

Perhaps instead of 'blobs':

    payload-server
    cas-server
    data-server
    storage-server
    storage

And instead of 'metadata':

    metadata-server (with 'payload-server' or 'data-server')
    artifact-server (with 'cas-server')
    index-server    (with 'storage-server')
    index           (with 'storage')

Some thoughts I have here are that:

  o Adding the suffix "-server" helps make it clear that the list is
    about servers.

  o The idea of 'storage' and 'index' seemed to me to change the
    scenery in such a way that the "-server" suffix isn't needed
    (this is of course entirely subjective).
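
As an illustration only, this is your split-cache example with the
'index' / 'storage' pair substituted for 'metadata' / 'blobs'. It is a
sketch of the naming, not a settled format; everything else is copied
from your proposal as is.

```yaml
artifacts:
   # Un-split caches can still be defined using the old spec
   - url: https://foo.com:11001
     server-cert: foo.crt

   # Split cache: 'index' replaces 'metadata', 'storage' replaces 'blobs'
   - index:
       url: https://foo2.com:11001
       server-cert: foo2.crt
     storage:
       url: https://foo3.com:11001
       server-cert: foo3.crt
```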

Cheers,
    -Tristan



