Re: [BuildStream] [Proposal] Plugin fragmentation / Treating Plugins as Sources



On Thu, 2019-04-11 at 15:53 +0100, Chandan Singh wrote:
Hi Tristan,

Thanks for writing this up. As you say, it's a bit beefy so I haven't
had the time to digest all of it. But, here are a few comments based
on what I have gathered so far.

Hi Chandan,

Thanks for replying so quickly and getting the ball rolling!

On Thu, Apr 11, 2019 at 1:35 PM Tristan Van Berkom via
buildstream-list <buildstream-list gnome org> wrote:

      This proposal attempts to reconcile the above by stating that:
      * Plugins should be split up into domain specific repositories
      * Plugins should not be packaged at all

I think this is probably a stronger statement than we need. I'd have
thought it'd be something like "Plugins don't need to be packaged". Or
are you suggesting to remove the `pip` origin for plugins entirely?

I am considering proposing that, but this is very orthogonal from the
proposal at hand.

What I *am* proposing however, is that the upstream plugins which we
maintain in the BuildStream gitlab group as "blessed", be distributed
via the `git` origin. This is mostly because I would much prefer a
scenario where blessed plugins either all live in the same repository
or they live in very fragmented separated repositories, over any ad hoc
middle ground.


While this is really orthogonal to the proposal, I've written up why I
would really like to drop the `pip` origin altogether below:

I consider the `pip` origin to be dangerous. It has resulted in the
bst-external repository being used via the `pip` origin before it was
stable, and any system (or environment) wide installation of a plugins
package is a hazard in itself.

I also posted similar examples in the recent thread about making
BuildStream 2 parallel installable, but let's repost them here
concisely:

  * Consider two projects which a given user needs to work with on
    the same host - both of which depend on bst-external.

    Since bst-external is not stable (until recently, and quite by
    accident), it can (and does) break in backwards incompatible ways.

    This means that you can easily find yourself in a situation
    where you need to upgrade bst-external in order to build one
    of your projects, and as a result you can no longer build the
    other project.

    In this case, a user would be faced with a pretty bad experience.
    They could wiggle/hack their way out of it by installing separate
    BuildStream installations into separate venvs or docker containers,
    and use a different BuildStream to work with each project.

  * Consider the same scenario above, but now link the above projects
    through a junction.

    In this scenario, you can have situations where the junctioning
    project cannot work at all - because it requires two different
    versions of bst-external to be usable in the same environment.
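
To make the conflict concrete, here is a minimal Python sketch. The
project names, the version numbers and both helper functions are
hypothetical; the only thing it models is that a single environment
holds exactly one installed version of bst-external.

```python
# Hypothetical model: each project needs one specific bst-external
# version, while a system (or venv) wide install provides exactly one.
REQUIRED = {
    "project-a": "0.9",   # made-up version numbers
    "project-b": "0.12",
}


def can_build(project, installed_version):
    """Return True if the installed plugin version is the one required."""
    return REQUIRED[project] == installed_version


def buildable_projects(installed_version):
    """List the projects that work with the single installed version."""
    return sorted(p for p in REQUIRED if can_build(p, installed_version))


if __name__ == "__main__":
    # Whichever version is installed, at most one project can be built.
    for version in ("0.9", "0.12"):
        print(version, "->", buildable_projects(version))
```

With a junction between the two projects, both entries would have to be
satisfied at once, which no single installed version can do.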


We have always recommended bst-external be used as a 'local' plugin
source and submoduled into your project, which is safe, but people have
not followed that recommendation. I see this as an invitation to
painful bug reports and noise which could have been avoided.

With all of that said, I don't mind having a way to install system wide
plugin collections but the messaging really has to be that these must
be API stable.

Also, I would have preferred that we use a more traditional approach
for packaged plugins, and have the plugin package install the plugins
into a *BuildStream owned* directory - using pkg-config in the regular
ways, as described here:

    https://gitlab.com/BuildStream/buildstream/issues/332

Had we done this from the beginning, we would not need to namespace
plugin collections which need to support both BuildStream 1 and
BuildStream 2. A plugin collection which depends on BuildStream 2 would
simply use the buildstream2.pc file and then transparently install
itself into the plugins directory owned by BuildStream 2, already
cleanly separated from the plugins directory owned by a potentially
parallel installed BuildStream 1.


      * Plugins are source code, and BuildStream will automatically
        obtain them in the same way that we track and fetch any other
        source code, so that it is a painless experience for the user.

Yes, but that's not quite it. Plugins also have dependencies - both on
host tools and on Python packages. For example, the `git`/`bzr`
plugins depend on the corresponding host tools. And, plugins like
Docker source have dependency on python packages like `requests`.
Treating them as _just_ source code will put the burden of getting
their dependencies on users. Note that this can be somewhat mitigated
if plugins are packaged.
[...]
I like the idea of something like a `git` origin, but I have a
counter-proposal :) When I was thinking about it a while back, I was
actually leaning towards a `venv` origin. I think this might be better
for the following reasons:

  * We don't "bless" a VCS and people are free to use whatever VCS
they wish to use.
  * Since we'd be installing them as Python packages, at the very
least we will be handling the python dependencies of the plugins, if
not system base dependencies.

I understand that you think it would be neat to have BuildStream
install python dependencies of plugins and that on the surface, this
does seem attractive and convenient.

There are some problems with this though, let me try to sort these out
with some responses to the statements you have made:

  * Plugins have dependencies.

    This is true, but plugins have always from the very beginning
    only been a python file with an optional matching yaml file
    with the same basename.

    It is the responsibility of Plugin.preflight() to report what
    system dependencies are required by the plugin at startup time,
    and the documenting of dependencies we've been doing is mostly
    a formality.

    Note that BuildStream itself does not hard require *any*
    of the dependencies required by the plugins BuildStream
    installs.

  * Plugins have external python library dependencies

    It was naive of me to ever think that it could be safe for
    a BuildStream plugin to import anything outside of the standard
    library, or the buildstream module itself.

    While this can work for a time, it is fragile - I don't think we
    should recommend anything which leaves opportunity for breakage.

    Consider that it has already been challenging for us upstream
    to always maintain a vector of python library dependency versions
    which are guaranteed to work together, some libraries (including
    even ruamel.yaml) need to be restricted or pinned, because
    the python ecosystem is not the shining example of responsible
    actors who care about API stability.

    Now while this more or less works for an "application" like
    BuildStream, which only needs to care about itself working against
    a precisely selected set of library dependency versions,
    encouraging plugins to start importing external python
    dependencies would be the same as saying that multiple python
    applications with potentially differing version requirements on
    the same libraries need to work within the same
    process/interpreter environment. In other words, it just very
    practically cannot work.

    So yes, a *few* plugins (ostree, docker, maybe pip?) import
    external python libraries, ones which we pretty much deem to be
    stable, but I think promoting this is not viable. In fact we should
    really make it a rule, and have source plugins *only* ever
    call out to host tools.

  * We don't bless a VCS if we use venvs

    No, but we *do* end up blessing PyPI and python packages,
    which is *at least* as bad as blessing a single VCS
    (and I think the whole world is basically onboard with git
    at this stage really).

    We also impose that plugin repositories do some python
    package voodoo when really, BuildStream plugins only really
    need to be a file (you admit to this in your writeup as well).

  * We don't have a venv to install into, anyway.

    BuildStream runs where you install it, and if you are going to
    install plugins and their dependencies into a venv, BuildStream
    needs to also be installed in that same venv.
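
For the point about Plugin.preflight() above, a check that relies only
on host tools can be sketched as follows. This is a standalone
illustration and not BuildStream code: the `HostToolSource` class and
the `MissingHostTool` exception are hypothetical stand-ins for a plugin
and its error reporting, and the lookup uses the standard library's
`shutil.which()`.

```python
import shutil


class MissingHostTool(Exception):
    """Hypothetical error raised when a required host tool is absent."""


class HostToolSource:
    """Hypothetical stand-in for a source plugin that shells out to tools."""

    def __init__(self, required_tools):
        self.required_tools = list(required_tools)
        self.tool_paths = {}

    def preflight(self):
        # Check every tool up front so the user gets one clear error at
        # startup instead of a failure halfway through a build.
        missing = []
        for tool in self.required_tools:
            path = shutil.which(tool)
            if path is None:
                missing.append(tool)
            else:
                self.tool_paths[tool] = path
        if missing:
            raise MissingHostTool(
                "required host tools not found: " + ", ".join(missing)
            )


if __name__ == "__main__":
    source = HostToolSource(["git"])
    try:
        source.preflight()
        print("all host tools found:", source.tool_paths)
    except MissingHostTool as error:
        print("preflight failed:", error)
```

Nothing here needs to be installed alongside the plugin: the file only
imports the standard library and reports what the host is missing.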

Now, I admit that I did consider a similar `venv` origin before, but
not in the same way that you mean. My idea would be a ton of work,
probably not for much gain in the end, and I'm not even sure that it
would be feasible.

My idea of a `venv` origin is that every version of every BuildStream
plugin from this origin lives in its own, isolated venv (i.e. my idea
would have been to "make it actually safe").

As far as I can see, this would require that we never instantiate
plugins in the main BuildStream process, and also that we never assume
that it is okay to have two plugins instantiated in the same process.
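
A rough sketch of what that isolation implies, assuming one interpreter
per plugin: the `run_isolated()` helper and the inline plugin snippet
are hypothetical, and `sys.executable` merely stands in for the Python
binary of a per-plugin venv.

```python
import json
import subprocess
import sys


def run_isolated(python, code, payload):
    """Run plugin code in a separate interpreter and return its JSON reply.

    'python' would be the interpreter of that plugin's own venv; the
    parent process never imports the plugin or its dependencies.
    """
    completed = subprocess.run(
        [python, "-c", code],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# Hypothetical plugin body: reads a request on stdin, replies on stdout.
PLUGIN_CODE = """
import json, sys
request = json.load(sys.stdin)
json.dump({"element": request["element"], "status": "ok"}, sys.stdout)
"""

if __name__ == "__main__":
    # sys.executable stands in for <plugin venv>/bin/python here.
    print(run_isolated(sys.executable, PLUGIN_CODE, {"element": "hello.bst"}))
```

Every interaction with a plugin would have to cross a process boundary
like this one, which is where the "ton of work" comes from.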


In summary, I sympathize with the desire to make something a bit more
automated and convenient, but I think that if we were to try to go that
route:

  * We complicate the plugin story more than we need to, by requiring
    packaging metadata and repository structure, while a BuildStream
    plugin really only is a "file" at the core.

  * We complicate the BuildStream side implementation: we don't
    even own a venv environment where we can run these plugins,
    and we cannot have BuildStream trying to cause side effects
    in the environment where it resides.

  * We would be recommending unsafe practices such as pulling
    in external python dependencies in plugins and hoping that
    it works when trying to run it in a shared environment with
    other plugins.
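
The third point can be seen with nothing but the standard library. In
the sketch below, `somelib` and both plugin functions are made up; the
fake module is registered by hand in `sys.modules` only to imitate two
plugins that expect different versions of the same library inside one
interpreter.

```python
import importlib
import sys
import types


def make_fake_library(version):
    """Build an in-memory module standing in for one release of 'somelib'."""
    module = types.ModuleType("somelib")
    module.__version__ = version
    return module


def plugin_a_sees():
    # Hypothetical plugin A was written against somelib 1.x.
    return importlib.import_module("somelib").__version__


def plugin_b_sees():
    # Hypothetical plugin B was written against somelib 2.x.
    return importlib.import_module("somelib").__version__


def load_both(installed_version):
    """Simulate an environment where exactly one somelib is importable."""
    sys.modules["somelib"] = make_fake_library(installed_version)
    try:
        return plugin_a_sees(), plugin_b_sees()
    finally:
        del sys.modules["somelib"]


if __name__ == "__main__":
    # Whichever version is present, both plugins receive the same one.
    print(load_both("1.4"))
    print(load_both("2.0"))
```

An import name maps to a single module object per process, so at most
one of the two plugins gets the version it was written against.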

[...]
From a UX perspective, I think it might be good to have a separate
`bst plugin` command group, similar to `source` and `artifact` groups.
I can imagine `bst plugin list/show` and `bst plugin fetch` commands
also being useful in future.

Certainly agree :)

Cheers,
    -Tristan


