Re: Package manager integration with BuildStream



On 2018-06-19 17:31, Gökçen Nurlu wrote:
Hi Jonathan,

Many thanks for your time!

Hi Gokcen,

I've now gone through the e-mail conversation, so my comments should be better informed.

Jonathan Maw via Buildstream-list <buildstream-list gnome org> wrote
on Tue, 19 Jun 2018 at 13:31:
I'm also going to avoid picking at specific details.

My understanding of the purpose of SourceTransform plugins is that they
need to be able to access the previous sources so they can (in Gopkg's
case) find the configuration file that lists which packages to download.

So, my questions are:

1. Is SourceTransform too specific? Could the ability to check previous
sources be useful for other sources at some point? If so, we'd be able
to avoid some complexity.

It'd surely give a lot of power to implementors but I honestly don't
know if that would be helpful or not.

In the current implementation, `SourceTransform._track()` and
`SourceTransform._fetch()` don't pollute previously staged files; they
just repeat the staging for them in a temp folder. This looks a bit
inefficient, but those sources are already fetched, so it shouldn't
cost much. (This was taken from
https://mail.gnome.org/archives/buildstream-list/2018-April/msg00036.html)
Technically, it is possible to make this new functionality live in
`source.py`. I imagine this would be a flag like `is_transform` that
source implementors could set to `True`. The behaviour of some private
methods in `Element` and `Source` would then change depending on that
value.

IMHO, SourceTransform's behaviour is significantly different from
Source's, and keeping them (i.e. what they provide) separate might be
cleaner from a plugin developer's perspective.
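
To make the trade-off concrete, here is a minimal toy sketch of the
`is_transform` flag idea combined with staging previous sources into a
temp folder. It is not BuildStream code: `ToySource`, `ToyTransform` and
`track_all` are hypothetical names, and the real `Element`/`Source`
internals are not modelled.

```python
import os
import tempfile


class ToySource:
    """Stand-in for a source plugin; NOT BuildStream's Source class."""

    is_transform = False  # hypothetical flag, as suggested above

    def __init__(self, files):
        self.files = files  # relative path -> text content

    def stage(self, directory):
        # Write this source's files into the given directory.
        for relpath, text in self.files.items():
            path = os.path.join(directory, relpath)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as f:
                f.write(text)


class ToyTransform(ToySource):
    """A source that needs to look at the sources listed before it."""

    is_transform = True

    def __init__(self):
        super().__init__({})
        self.seen = None

    def track(self, previous_dir):
        # Only inspects the scratch copy of the previous sources.
        self.seen = sorted(os.listdir(previous_dir))


def track_all(sources):
    # The caller branches on the flag: transform sources get a throwaway
    # staging of everything that precedes them in the list.
    for index, source in enumerate(sources):
        if not source.is_transform:
            continue
        with tempfile.TemporaryDirectory() as scratch:
            for previous in sources[:index]:
                previous.stage(scratch)
            source.track(scratch)
```

Even in this toy version the caller has to special-case the flag, which
is the kind of branching a separate SourceTransform class would keep out
of plain Source plugins.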


Yep, I see what you mean.

2. In __ensure_previous_sources() we're fetching and tracking a bunch of
sources. I'm concerned that ought to be done in the pipeline, instead.

SourceTransform's consistency and ref strictly depend on the previous
sources' files. This could mean reading a pip.lock, Gopkg.lock,
requirements.txt with versions, yarn.lock(?) etc. I don't think there
is a nice way to read them without actually retrieving them :( We
might want to keep looking for more lightweight ways, though.
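
As an illustration of why the file contents are needed, here is a rough
sketch that derives a ref from a requirements.txt with pinned versions.
The function names are hypothetical, only `name==version` lines are
handled, and stripping everything after `#` is a simplification of the
real requirements format.

```python
import hashlib


def parse_pinned_requirements(text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (simplified)
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError("unpinned requirement: " + line)
        pins[name.strip().lower()] = version.strip()
    return pins


def ref_from_pins(pins):
    """Hash the sorted pins so the ref does not depend on line order."""
    canonical = "\n".join(
        "{}=={}".format(name, version) for name, version in sorted(pins.items())
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The point is only that the ref is a function of the file's contents, so
the file has to be available before `track` can produce anything.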

About reusing the pipeline for this, I don't know how to approach that
and am definitely open to suggestions. One idea I have in mind: if
there is a SourceTransform-kind of source in an element's list of
sources, maybe `Element` could behave differently and do this
"ensuring" for us.

It looks like I misunderstood what queues do for us when I suggested using the pipeline for those tasks. I previously thought that the scheduler split source fetches into separate tasks in the queue, but the queues only operate at the Element level.
In addition, the queues are separate for each task, so we wouldn't actually be able to do it that way.

I think restructuring the scheduler to do this would be more work than it's worth.

3. This has probably been discussed earlier, but would it be acceptable
to embed the contents of the Gopkg.toml file into the source's config
field, instead?
    If the files are too big, I believe there was a proposal for
buildstream to be able to include loose files into the yaml.
    I suppose the main thing against sticking the package list in the
project is that you wouldn't be able to track the package list and would
have to include it manually :-/

We want to be able to pick any Source plugin (git, tar etc.) and
combine it with any SourceTransform plugin (dep, pip, cargo...) in an
element. SourceTransform doesn't have to know where the files come from.
This way, we also use BuildStream's source caching, AFAIK.

When populated after `track`, the REF field of SourceTransform-kind
plugins will probably be big and ugly (literally dumping Gopkg.lock,
cargo.lock etc. there). I think plugin implementors can work around
that by requiring a lock file from the corresponding package manager.
IMO, this is good because if we can provide a way to pass the previous
sources' `refs` to a SourceTransform-kind plugin, it can use them as its
own ref for tracking and caching, and maybe we can even avoid
`__ensure_previous_sources` in `SourceTransform.track()`.
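
A minimal sketch of that last idea, assuming the previous sources' refs
are available as plain strings and that a lock file in those sources
fully pins the dependencies. `combined_ref` is a hypothetical helper,
not an existing BuildStream function.

```python
import hashlib
import json


def combined_ref(previous_refs):
    """Derive one short ref from the ordered refs of the previous sources."""
    # JSON-encoding the list keeps ["ab"] distinct from ["a", "b"] and
    # preserves the order in which the sources appear in the element.
    payload = json.dumps(list(previous_refs), separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With something like this, the transform's ref stays a fixed-size string
instead of a dump of the lock file, and it changes whenever any previous
source's ref changes.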


Yep, I see what you mean.

I'll have another look over the code to see if anything stands out, and let you know if anything else comes to mind.

Best regards,

Jonathan

--
Jonathan Maw, Software Engineer, Codethink Ltd.
Codethink privacy policy: https://www.codethink.co.uk/privacy.html

