Re: [BuildStream] [Proposal] Plugin fragmentation / Treating Plugins as Sources



Hi Tristan,

On Fri, Apr 12, 2019 at 8:36 AM Tristan Van Berkom
<tristan vanberkom codethink co uk> wrote:


I think this is probably a stronger statement than we need. I'd have
thought it'd be something like "Plugins don't need to be packaged". Or
are you suggesting to remove the `pip` origin for plugins entirely?

I am considering proposing that, but this is orthogonal to the proposal
at hand.

What I *am* proposing however, is that the upstream plugins which we
maintain in the BuildStream gitlab group as "blessed", be distributed
via the `git` origin. This is mostly because I would much prefer a
scenario where blessed plugins either all live in the same repository
or they live in very fragmented separated repositories, over any ad hoc
middle ground.

I am afraid I do not agree that we should be using Git as the preferred
distribution mechanism. Or, at least, not the only one. This is mainly because
of operational reasons.

First, Git is an inefficient way to transfer such plugins, as we do not need
the history at all. Since we just need the content, transferring the `.git`
directory is completely superfluous. Any kind of archive format will be
better than Git in terms of performance.

Second, using Git as a distribution mechanism also raises scalability
concerns, since neither Git nor Git servers are designed for that use case.
To see an example of this, one does not have to look further than the
CocoaPods disaster of 2016 [1].

Even if we were to settle on Git as the distribution mechanism, I think the
current proposal for the `git` origin is basically reinventing Git submodules.
I think it will be superfluous to add such a `git` origin, as everything it
offers can already be done with Git submodules (or their counterparts in other
VCS frameworks) and the `local` origin, whereby a project stores its external
plugins as submodules. Won't `bst plugin fetch` end up being the same thing as
`git submodule update`?

<snip>


There are some problems with this though, let me try to sort these out
with some responses to the statements you have made:

  * Plugins have dependencies.

    This is true, but plugins have always from the very beginning
    only been a python file with an optional matching yaml file
    with the same basename.

    It is the responsibility of Plugin.preflight() to report what
    system dependencies are required by the plugin at startup time,
    and the documenting of dependencies we've been doing is mostly
    a formality.

    Note that BuildStream itself does not hard require *any*
    of the dependencies required by the plugins BuildStream
    installs.
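
(For context, the preflight mechanism referred to here looks roughly like
the sketch below; `FooSource` and the host tool `foo` are hypothetical,
while `get_host_tool()` is the utility BuildStream provides for locating
host tools.)

    # Minimal sketch of a source plugin reporting a system
    # dependency at startup time
    from buildstream import Source, utils

    class FooSource(Source):

        def preflight(self):
            # Raises an error at startup if 'foo' is not installed
            # on the host, instead of failing midway through a build
            self.host_foo = utils.get_host_tool('foo')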

I don't think this is an accurate representation of the current situation.
Take bst-external, for example: it depends on the `requests` package and,
as such, declares it as one of its dependencies in its setup.py [2]. This
means that when one runs `pip install buildstream-external`, `requests` gets
installed automatically. I think forcing people to install such dependencies
manually would be an inferior user experience compared to what we have
currently.

If we split up the current Core plugins into separate repositories, I'd have
expected each plugin to declare its own dependencies in its setup.py.


  * Plugins have external python library dependencies

    It was naive of me to ever think that it could be safe for
    a BuildStream plugin to import anything outside of the standard
    library, or the buildstream module itself.

    While this can work for a time, it is fragile - I don't think we
    should recommend anything which leaves opportunity for breakage.

    Consider that it has already been challenging for us upstream
    to always maintain a vector of python library dependency versions
    which are guaranteed to work together, some libraries (including
    even ruamel.yaml) need to be restricted or pinned, because
    the python ecosystem is not the shining example of responsible
    actors who care about API stability.

If we go with my venv proposal, we will not have to care about API stability.
If we install each plugin into its own virtual environment, it can express
fully resolved versions of its dependencies, and does not have to care about
what other plugins (or any other host package) need.
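
As a rough sketch of what I mean (all names illustrative, and POSIX paths
assumed), BuildStream could create one environment per plugin:

    import subprocess
    import venv

    def install_plugin(package, env_dir):
        # Create an isolated environment with pip available
        venv.EnvBuilder(with_pip=True).create(env_dir)
        # Install the plugin and its fully resolved dependencies
        # into that environment only
        subprocess.check_call([
            env_dir + '/bin/python', '-m', 'pip', 'install', package,
        ])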


    Now while this more or less works for an "application" like
    BuildStream, who only needs to care about itself working against a
    precisely selected set of library dependency versions; if we were
    to extend this to say that we encourage plugins to start importing
    external python dependencies - it would be the same as saying
    that multiple python applications with potentially differing
    version requirements on the same libraries need to work within
    the same process/interpreter environment, in other words it
    just very practically cannot work.

    So yes, a *few* (ostree, docker, maybe pip ?) import external
    python libraries, ones which we pretty much deem to be stable,
    but I think promoting this is not viable - in fact we should
    really make it a rule, and have source plugins *only* ever
    call out to host tools.

Why is that? I am not sure I understand why using a host tool instead of
a library is automatically going to fix the issues with API instability. If
a tool has both a Python API and a CLI (like BuildStream itself), and is
unstable, then swapping the API for the CLI isn't magically going to make it
stable. In fact, using the Python library is often cleaner in my opinion, as
one does not have to worry about subprocesses etc.
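
To make that concrete, here is a contrived comparison, fetching a URL
through the `requests` library versus shelling out to `curl` as a host
tool (the URL is of course made up):

    import subprocess
    import requests

    # Library: structured access to the result
    data = requests.get('https://example.com/src.tar.gz').content

    # Host tool: the same operation, plus subprocess plumbing that
    # we have to manage ourselves
    data = subprocess.run(
        ['curl', '--silent', 'https://example.com/src.tar.gz'],
        check=True, stdout=subprocess.PIPE,
    ).stdout

If the underlying project is unstable, both variants break equally; the
CLI route just adds plumbing on top.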


  * We don't bless a VCS if we use venvs

    No, but we *do* end up blessing PyPI and python packages,
    this is *at least* as bad as blessing a single VCS
    (and I think the whole world is basically onboard with git
    at this stage really).

I think it is really ironic you say that when one of the Core source plugins
is `bzr`. If someone is using `bzr` to manage their source code, I'd say there
is a good chance that they may want to use it for storing their BuildStream
plugins as well.

I think blessing PyPI is less bad, as that is the standard way to distribute
Python packages, which is what BuildStream plugins are at the end of the day.

    We also impose that plugin repositories do some python
    package voodoo when really, BuildStream plugins only really
    need to be a file (you admit to this in your writeup as well).

We do not force anyone to use Git to version control their plugins; we do not
even require that plugins be in version control at all. However, we do require
that they be written in Python. Given that, I don't think it is completely
unreasonable to expect plugin authors to also write a 10-line setup.py, the
de facto Python packaging mechanism.
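
Concretely, the kind of setup.py I have in mind is no more than the
following, for a hypothetical single-file `foo` plugin:

    from setuptools import setup

    # Hypothetical packaging for a single-file plugin, declaring
    # its one external dependency
    setup(
        name='buildstream-foo',
        version='0.1',
        py_modules=['foo'],
        install_requires=['requests'],
    )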


  * We don't have a venv to install into, anyway.

    BuildStream runs where you install it, and if you are going to
    install plugins and their dependencies into a venv, BuildStream
    needs to also be installed in that same venv.

Yes, but I am proposing that we create a virtual environment for our plugins.
BuildStream should manage it as an internal implementation detail. For example,
we can use `pip` to download packages along with their dependencies into a
directory that we control and then use variables like `PYTHONPATH` to ensure
that we can find the plugin correctly. There can be other solutions with a
similar approach as well.
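
For instance, something along these lines (only a sketch; the cache path
and package name are illustrative):

    import os
    import subprocess
    import sys

    plugins_dir = os.path.expanduser('~/.cache/buildstream/plugins/foo')

    # Let pip resolve the plugin and its dependencies into a
    # directory that BuildStream controls
    subprocess.check_call([
        sys.executable, '-m', 'pip', 'install',
        '--target', plugins_dir, 'buildstream-foo',
    ])

    # Make the plugin importable; setting PYTHONPATH would do the
    # same for child processes
    sys.path.insert(0, plugins_dir)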

<snip>

From a UX perspective, I think it might be good to have a separate
`bst plugin` command group, similar to `source` and `artifact` groups.
I can imagine `bst plugin list/show` and `bst plugin fetch` commands
also being useful in future.

Certainly agree :)

Great to see that we at least agree on this one :)

Cheers,
    -Tristan


Cheers,
Chandan

