[BuildStream] Source cache summary



Hi everyone,

This is a summary of the source cache discussion so far (mainly in
[1], plus some additional discussions), which seems to have settled
down. Feel free to comment still!

Some cache directory refactoring is still in progress at [2]. It adds a
root cache directory setting; by default this directory contains the
artifact and build directories, and the separate options for those are
now deprecated. The CAS and tmp directories now live outside of the
artifact directory, directly within the root cache directory. I'm
currently still moving the cache quota methods into `CASCache`, since
with the CAS directory outside of the artifact directory they are no
longer accurate. Furthermore, with a CAS-backed source cache we will
want the cache quota to be reported by the CAS as a whole rather than
for just the artifact directory.
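
To illustrate why the quota check has to measure the CAS directory, here
is a minimal sketch. `directory_size` and `over_quota` are hypothetical
helpers, not the `CASCache` API, and summing file sizes on disk is only
an assumption about how usage is measured.

```python
import os


def directory_size(path):
    """Return the total size in bytes of regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so a link is not counted as its target
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def over_quota(cas_dir, quota_bytes):
    """Hypothetical check: is the whole CAS directory above the quota?"""
    return directory_size(cas_dir) > quota_bytes
```

Once the CAS no longer lives inside the artifact directory, measuring
only the artifact directory would miss the stored objects entirely.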

For the source cache, the plan is for the class to live in the context,
so that the `Source` class consults it before fetching sources directly
from upstream. This mirrors the way elements consult the artifact cache
before building an artifact. The source cache itself would deal with
staging sources into the local CAS, pushing them to remote CASs where
configured, and pulling them where available. When building with
sources that are already cached, we don't need to call the source's
stage method because they are already staged in the source cache; this
improves build times because we only need to import them from the local
CAS. When we fetch sources directly we will still have the source in
sourcedir, which remains necessary for opening workspaces for some
plugins.
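
The build-time behaviour described above can be sketched roughly as
follows. This is a toy model under stated assumptions: `ToySource` and
`stage_for_build` are hypothetical names, and a plain dict stands in for
the local CAS.

```python
class ToySource:
    """Stand-in for a source plugin; not the real BuildStream Source class."""

    def __init__(self, key, files):
        self.key = key
        self._files = files
        self.stage_calls = 0  # counts how often plugin staging runs

    def stage(self):
        # Plugin-specific staging, e.g. unpacking a tarball
        self.stage_calls += 1
        return dict(self._files)


def stage_for_build(source, local_cas):
    """Return the staged tree for source, preferring the local CAS."""
    if source.key in local_cas:
        # Already cached: import from the CAS, skip the plugin's stage()
        return dict(local_cas[source.key])
    tree = source.stage()
    local_cas[source.key] = tree  # cache the staged tree for next time
    return dict(tree)
```

Calling `stage_for_build` twice for the same source runs the plugin's
`stage` method only once; the second call is served from the dict.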

The source cache, like the artifact cache, can be configured at the
project or user level.

This `SourceCache` class will have these methods to allow sources to
be fetched from and uploaded to a remote CAS:
* fetch: given a source, checks the local CAS for its presence and, if
it is absent, attempts to fetch it from remote source caches when
configured.
* stage: when a source isn't in the CAS, it should be staged into the
CAS right after fetching it via the source plugin.
* push: once the source has been staged into the local CAS, it should be
pushed to the remotes that are configured to enable push, the same as
for the artifact cache.
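
A minimal sketch of how these three methods might fit together, assuming
in-memory dicts in place of the local and remote CAS. `ToyRemote` and
`ToySourceCache` are hypothetical names and the signatures are
illustrative, not the interface on the branch.

```python
class ToyRemote:
    """Stand-in for a remote CAS; push_enabled mirrors the push setting."""

    def __init__(self, push=False):
        self.push_enabled = push
        self.objects = {}


class ToySourceCache:
    def __init__(self, remotes=()):
        self.local = {}  # stand-in for the local CAS
        self.remotes = list(remotes)

    def fetch(self, key):
        """Return True if key is in the local CAS, pulling it if needed."""
        if key in self.local:
            return True
        for remote in self.remotes:
            if key in remote.objects:
                self.local[key] = remote.objects[key]
                return True
        return False  # caller falls back to the source plugin

    def stage(self, key, tree):
        """Record a tree obtained via the source plugin in the local CAS."""
        self.local[key] = tree

    def push(self, key):
        """Upload key to push-enabled remotes; return how many received it."""
        pushed = 0
        for remote in self.remotes:
            if remote.push_enabled and key not in remote.objects:
                remote.objects[key] = self.local[key]
                pushed += 1
        return pushed
```

A failed `fetch` is the signal to go upstream via the plugin, after
which `stage` and then `push` make the source available to others.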

The source cache should contain a reference to the `CASCache` object and
the configured remotes, the same as the artifact cache. As a result,
both the artifact and source caches should derive from a base cache
class to minimize duplication of the methods pertaining to the local and
remote CAS.
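
One possible shape for that shared base class, as a sketch only. All
class names here are hypothetical, `ToyCAS` is an in-memory stand-in for
`CASCache`, and the per-cache ref namespace is my own assumption rather
than something settled in the discussion.

```python
class ToyCAS:
    """In-memory stand-in for the local CAS; not the real CASCache."""

    def __init__(self):
        self.refs = {}

    def contains(self, ref):
        return ref in self.refs

    def commit(self, ref, tree):
        self.refs[ref] = tree


class ToyBaseCache:
    """Shared local/remote plumbing; subclasses only pick a namespace."""

    namespace = None  # assumption: keeps artifact and source refs apart

    def __init__(self, cas, remotes=()):
        self.cas = cas  # reference to the shared CAS object
        self.remotes = list(remotes)  # configured remotes

    def _ref(self, key):
        return "{}/{}".format(self.namespace, key)

    def contains(self, key):
        return self.cas.contains(self._ref(key))

    def commit(self, key, tree):
        self.cas.commit(self._ref(key), tree)


class ToyArtifactCache(ToyBaseCache):
    namespace = "artifacts"


class ToySourceCache(ToyBaseCache):
    namespace = "sources"
```

Both caches can then share one CAS object while the common methods are
written once in the base class.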

A very early WIP of the source cache can be found at [3].

Cheers,
Raoul

[1]
https://mail.gnome.org/archives/buildstream-list/2019-January/msg00031.html
[2] https://gitlab.com/BuildStream/buildstream/merge_requests/1100
[3] https://gitlab.com/BuildStream/buildstream/tree/raoul/440-source-cache

