Re: [BuildStream] [Proposal] Plugin fragmentation / Treating Plugins as Sources



Hi,

Horrible Python hacks are one of my favourite subjects :) .

TL;DR/Summary:

It's possible to override Python's import mechanisms to achieve this
(multiple versions of a library used in one process). A working (in
Python 2.7, but trivially portable to 3.x) example of this is [0].

I don't have much knowledge about PluginBase so I don't know if it
provides similar utility without forcing plugin authors to use the same
hack, or even how possible it would be to extend it to do so.

I think it is worth the effort of at least looking into it; whilst the
proposed `bst plugin fetch` idea solves one part of the problem solved
by pip, I think forcing plugin authors to be limited to the standard
library is too restrictive.


On Sun, 2019-04-14 at 13:53 +0900, Tristan Van Berkom via
buildstream-list wrote:
<snip>
We recommended that for bst-external, people just didn't do it
(potentially because we misguided them by providing the `pip` origin at
all in the first place).

I don't remember seeing a strong recommendation to use submodules when
I was first experimenting with BuildStream a few months ago, though it's
entirely possible I didn't read the docs well enough. Even so, I don't
really like having to set up submodules in my project just to provide
plugins from an upstream source (the plugins might not be stable or
marked as "blessed" by being in the main repo but they are still part
of the BuildStream GitLab group and look upstream to me).

<snip>
  A) If we ensure that there is only `bst-plugins-good` plus whatever
     local plugins a project uses, then we don't have this problem.

     But then there is not much point in moving the plugins outside
     of BuildStream at all.

     I am really fine with this, but due to the (completely irrational)
     desire I've seen around wanting to strip out the important ostree
     plugin from the core even in advance of other plugins, I worry
     that trying to maintain the core plugins in a single repo is only
     going to cause internal friction and arguments about what should
     be included or not.

I agree with this, I'm not sure of the point of just moving all the
plugins into a plugins repo.

  B) If we fragment plugins maximally, as stated above, we need
     BuildStream to automate the operation of obtaining them.

+1. However, I'm interested to know what you see as the benefits of
maximal fragmentation, rather than having repositories/plugin
"libraries" which provide groups of related plugins?

<snip>
This means that when one runs `pip install buildstream-external`, they
don't have to install `requests` manually. I think forcing people to do
that will be an inferior user experience compared to what we have
currently.

I recognize this, but we have never required that plugin repositories
behave like this.

A more relevant reply is that external python library dependencies are
clearly the edge case, and it is the distro packaging which takes care
of most of the burden of obtaining the majority of dependencies for a
collection like bst-external (so installing it on debian might
automatically install git, bzr, ostree etc).

I.e. since we know for sure that source plugins will always have *some*
need for host dependencies, it doesn't make sense to treat the python
installer as if it were something that solves the problem of installing
dependencies - we have Plugin.preflight() because we know there will be
cases where a plugin file is installed but not yet ready to function.

I understand that the plugins are currently expected to indicate
missing dependencies rather than blindly assume they're present, but in
my experience of getting started with BuildStream the part where I
needed to install a bunch of tools to make plugins actually work was
one of the most frustrating.

I think it's worth minimising this as much as possible, and it's
certainly possible to automate installation of Python dependencies. In
general, I think "you need to install the following host tools to use
this plugin" is more expected than "you need to install the following
host tools and also this set of Python dependencies".
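
As a rough illustration of the two kinds of missing dependency discussed
above, here is a minimal sketch of a preflight-style check. It does not
use BuildStream's actual plugin API; the function names
`missing_requirements` and `preflight_message` are made up for this
example, and a real plugin would report the problem from its own
Plugin.preflight() implementation instead.

```python
import importlib.util
import shutil


def missing_requirements(host_tools, python_modules):
    """Return (missing_tools, missing_modules) without importing anything.

    Hypothetical helper: host tools are looked up on PATH, Python
    modules are looked up by top-level name.
    """
    missing_tools = [t for t in host_tools if shutil.which(t) is None]
    missing_modules = [
        m for m in python_modules if importlib.util.find_spec(m) is None
    ]
    return missing_tools, missing_modules


def preflight_message(plugin, host_tools, python_modules):
    """Return one error string describing what is missing, or None."""
    missing_tools, missing_modules = missing_requirements(
        host_tools, python_modules
    )
    if not missing_tools and not missing_modules:
        return None
    parts = []
    if missing_tools:
        parts.append("host tools: " + ", ".join(missing_tools))
    if missing_modules:
        parts.append("Python modules: " + ", ".join(missing_modules))
    return "{} cannot run, missing {}".format(plugin, "; ".join(parts))
```

Only the second list (Python modules) is something an installer like pip
could resolve automatically; the first list still has to come from the
host.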

<snip>
If we split up the current Core plugins into separate repositories, I'd
have expected that each plugin will declare its own dependencies in its
setup.py.

Most dependencies cannot be installed via python anyway, requiring a
setup.py and pythonic package structure for all plugins just for the
sake of a few outliers seems to be pretty overkill.

Is there really much pain associated with writing a setup.py and
structuring things as a Python package? Especially if splitting plugins
into sets of related plugins rather than 1 plugin == 1 repo.

<snip>
If we go with my venv proposal, we will not have to care about API
stability. If we install each plugin into its own virtual environment,
it can express fully resolved versions of its dependencies, and does not
have to care about what other plugins (or any other host package) need.

I don't think your venv proposal is complete.

Honestly, from your writeup you did not seem to be proposing something
where each and every plugin only exists within its own isolated venv
and namespace.

I even wrote in my last mail that I did consider this but thought it
was completely overkill, and the obstacles to making that work (if even
possible at all) are high.

Can you please explain exactly how we can have all of our plugins
living in isolated venvs and still callable in the same interpreter
process?

You can override the default import mechanism to allow this, by using
`imp` or similar to pre-create a module for each version of a library
(with a versioned name) and then automatically rewriting import names
to make them return the correct version of the imported module.

An example implementation of this is [0]. That is broken in Python 3,
but can be fixed with a simple s/__builtin__/builtins/.

This is a pretty horrible hack and also will probably add overhead for
plugin authors by requiring they use it in their plugins (unless the
plugin mechanisms in BuildStream provide a place that such magic could
be implemented somehow).
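
To make the first half of that idea concrete, here is a minimal sketch
of pre-creating one module per library version under a versioned name.
It is not the code from [0] and not something BuildStream or PluginBase
provides; `load_versioned`, `demo` and the `name__vX_Y` naming scheme
are assumptions made for this example. It uses `importlib` rather than
`imp`, since `imp` was removed in Python 3.12, and it deliberately does
not rewrite `import` statements, which is the part that needs the import
hook.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_versioned(name, version, path):
    """Load the module file at `path` under a version-mangled name.

    The mangled name gives each version its own sys.modules slot, so
    two versions of the same library can coexist in one interpreter.
    """
    mangled = "{}__v{}".format(name, version.replace(".", "_"))
    if mangled in sys.modules:
        return sys.modules[mangled]
    spec = importlib.util.spec_from_file_location(mangled, str(path))
    module = importlib.util.module_from_spec(spec)
    sys.modules[mangled] = module
    spec.loader.exec_module(module)
    return module


def demo():
    """Write two fake versions of a library to disk and load both."""
    with tempfile.TemporaryDirectory() as tmp:
        loaded = {}
        for version in ("1.0", "2.0"):
            libdir = Path(tmp) / version
            libdir.mkdir()
            path = libdir / "mylib.py"
            path.write_text("VERSION = {!r}\n".format(version))
            loaded[version] = load_versioned("mylib", version, path)
        return loaded
```

Calling `demo()` returns both module objects, each carrying its own
`VERSION`; a plain `import mylib` inside a plugin would still need to be
redirected to the right one of them.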

<snip>
  * We don't bless a VCS if we use venvs

    No, but we *do* end up blessing PyPI and python packages,
    this is *at least* as bad as blessing a single VCS
    (and I think the whole world is basically onboard with git
    at this stage really).

I think it is really ironic you say that when one of the Core source
plugins is `bzr`. If someone is using `bzr` to manage their source code,
I'd say there is a good chance that they may want to use it for storing
their BuildStream plugins as well.

We cannot dictate to anyone how to manage their source code, especially
when most of the modules someone wants to build with BuildStream are
not even managed in their own VCS but obtained from a third party.

We can certainly dictate that if you want to maintain a BuildStream
plugin via the `git` origin, that you must host it in a git repository.

This is true, but that is effectively dictating to someone how they
should manage their source code if they want to produce a plugin that
can actually be easily used by BuildStream users.

<snip>
I think blessing PyPI is less bad as that is the standard way to
distribute Python packages, which is what BuildStream plugins are at the
end of the day.

That is again false.

A BuildStream plugin is a *file*.

By python's definitions, a package can be one of two things:

  * A directory of python files which contains an __init__.py

  * Something which can be distributed via PyPI, complete with
    the package metadata bells and whistles

BuildStream doesn't require either of the above: A BuildStream plugin is
a python *file* which BuildStream carefully loads into an isolated
namespace (such that multiple, differing plugins of the same name can
exist in the same interpreter).

A Python package is a directory containing Python modules. It is
entirely possible to distribute a single Python module on PyPI. A good
example is the bottle[1] framework. I don't think that adding a small
setup.py alongside a single-file plugin is much overhead really.

<snip>
    We also impose that plugin repositories do some python
    package voodoo when really, BuildStream plugins only really
    need to be a file (you admit to this in your writeup as well).

We do not force anyone to use Git to version control their plugins, we
do not even require that plugins be in version control at all. However,
we do require that they be written in Python. Since we already require
that the plugins be written in Python, I don't think it is completely
unreasonable to expect plugin authors to also write a 10-line setup.py,
which is the de facto python packaging system.

Python is a language choice, it is by no means at all a choice to
participate or buy into Python's packaging and distribution frameworks.

If we can avoid requiring that setup.py, since we already treat plugins
as simple python files (not packages), then I think we provide added
value to plugin authors.

This might make it easier to throw together a plugin quickly, but I
don't think that it's really all that arduous to add a small setup.py.
We could even provide a template repository, or write a small tool to
autogenerate the supporting things (we could also use cookiecutter[2]
or similar here).
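
For a sense of scale, a small generator for such a setup.py could look
roughly like the sketch below. The template contents and the
`render_setup_py` name are assumptions for illustration only; the
plugin, module and dependency names passed to it are up to the author,
and any extra metadata BuildStream's `pip` origin needs in order to
discover the plugin is left out here.

```python
from string import Template

# Template for a single-module distribution: `py_modules` ships one
# .py file, so no package directory or __init__.py is needed.
SETUP_TEMPLATE = Template('''\
from setuptools import setup

setup(
    name="$name",
    version="$version",
    py_modules=["$module"],
    install_requires=$requires,
)
''')


def render_setup_py(name, version, module, requires=()):
    """Return the text of a minimal setup.py for a one-file plugin."""
    return SETUP_TEMPLATE.substitute(
        name=name,
        version=version,
        module=module,
        requires=repr(sorted(requires)),
    )
```

Writing the returned text to `setup.py` next to the plugin file would be
the whole job of the tool; the generated file itself is under ten lines.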

[0]: https://github.com/mitsuhiko/multiversion
[1]: https://github.com/bottlepy/bottle
[2]: https://cookiecutter.readthedocs.io/en/latest/readme.html

Thanks,

Adam


