Re: [BuildStream] [Summary] Plugin fragmentation / Treating Plugins as Sources



Hi Mathieu,

On Fri, 2019-04-19 at 00:07 +0200, Mathieu Bridon wrote:
> Hi,
>
> On Thu, 2019-04-18 at 20:05 +0900, Tristan Van Berkom via buildstream-
> list wrote:
> > The `git` plugin origin
> > ~~~~~~~~~~~~~~~~~~~~~~~
> >
> > [… snip …]
> >   * This proposal downloads unnecessary git history of plugin
> >     repositories, and adds unnecessary load to upstream git
> >     repositories which host these plugins.
> >
> >     I believe this particular point is moot when you consider that
> >     it is the same for regular source code from git, and that we
> >     already have SourceCache as a mitigation for this.
>
> In the case of plugins, having the history seems entirely wasteful and
> unnecessary. Why not do shallow clones, fetching only the required
> version of the plugin?

This is just as true of fetching git repositories that you build
source code from as it is of fetching plugins.

When fetching source code from a git repository to build, we don't need
the full history; we only really need it for a workspace.

Arguably, it should ideally be possible to track the latest commit on a
branch without downloading the whole history, but sadly that is not
easily possible with git, so we need to download the git repositories
in order to track as well.

There is an issue open about this:

    https://gitlab.com/BuildStream/buildstream/issues/261

If this were to be implemented, I would expect the git plugin origin to
inherit the same mechanics, but as highlighted before (and also
mentioned in the above report while it was still in the planning
stages), the SourceCache should solve this as well.

I.e. the SourceCache is another usage of CAS (which we use for artifact
caches, so it can probably be stored in the same artifact cache), and it
would allow us to download only a given checkout based on the cache key
of a given Source (so it should work equally for all source kinds).
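
To make the idea concrete, here is a rough sketch of a cache that maps a
source's cache key to a content-addressed checkout. This is only an
illustration and not BuildStream's actual SourceCache or CAS code: the
names `source_cache_key` and `ToySourceCache` are hypothetical, and an
in-memory dict stands in for the real store.

```python
import hashlib
import json


def source_cache_key(kind, config):
    """Derive a stable key from a source's kind and exact configuration.

    Hypothetical helper: the real cache key calculation is more involved.
    """
    payload = json.dumps({"kind": kind, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToySourceCache:
    """In-memory stand-in for a CAS-backed source cache (illustration only)."""

    def __init__(self):
        self._blobs = {}  # content digest -> staged checkout bytes
        self._refs = {}   # source cache key -> content digest

    def commit(self, key, checkout):
        # Store the checkout under its own digest, then point the key at it.
        digest = hashlib.sha256(checkout).hexdigest()
        self._blobs[digest] = checkout
        self._refs[key] = digest
        return digest

    def pull(self, key):
        # Only the checkout for this exact key is returned, never any history.
        digest = self._refs.get(key)
        return None if digest is None else self._blobs[digest]
```

Nothing in the lookup depends on the source being git, which is why the
same mechanism should work for every source kind.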

Granted that we have SourceCache, I think none of these are concerns
with a git origin.

However, it would indeed seem very tricky to use SourceCache in
conjunction with a venv/pip solution (for the same reasons that
tracking is probably out of the question), as it would require a lot of
code to understand pip's URL formats and derive a cache key which we
know is unique for an exact version, and not something vague like a
branch or tag name.
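
As a small illustration of the problem, the sketch below handles just
one of pip's formats, the `git+<url>@<ref>` VCS form, and only accepts
a ref that is a full 40-character commit sha. The function name
`pinned_commit` is hypothetical, and every other requirement format pip
accepts would need its own handling.

```python
import re
from urllib.parse import urlsplit

# A full git commit sha is unambiguous; a branch or tag name is not.
_FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def pinned_commit(requirement_url):
    """Return the commit sha if the URL pins one exactly, else None."""
    parts = urlsplit(requirement_url)
    if not parts.scheme.startswith("git+"):
        return None  # not a git VCS requirement; handled elsewhere, if at all
    _path, sep, ref = parts.path.rpartition("@")
    if not sep:
        return None  # no ref given, so pip would use the default branch
    return ref if _FULL_SHA.match(ref) else None
```

Anything for which this returns None (a branch, a tag, or no ref at all)
cannot safely be turned into a cache key.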

Cheers,
    -Tristan


