Re: Package manager integration with BuildStream



On Mon, 2018-04-30 at 12:21 +0000, Sander Striker wrote:
Hi,

On Mon, Apr 30, 2018 at 1:39 PM Chandan Singh <chandan chandansingh net> wrote:
Hi,

I generally like the SourceTransform approach and don't have any better
alternatives to offer at this point. But, as Sander asked to find holes in this
approach, I can play the devil's advocate for a short while.

:)
 
Now, I don't like the above approach much either, but in the case of
cargo/rust, it is a bit better because it doesn't require that anyone go
installing the rust toolchain on their host just because one package
out of 500 happens to use rust - eventually, when cargo and rustc
become more commonly available on distros, a source plugin to do this
legwork would be better.
You raise an interesting point there; the need of [compatible] host tools
to deal with certain element types.
My current understanding of the SourceTransform approach is that it will not
eliminate the need to have an element/language-specific toolchain on the
host. Is that correct?

That is correct.
 
And if so, maybe we also need to think about how to minimize the
number of host tools needed on a developer's host system.

(This is also a problem with most of the other approaches mentioned so far,
except maybe for the bst file generator approach.)

This is not a complete solution but do we need to have some sort of "staging
sandbox" without network restrictions so that we can move the requirement of
having the tools from the host to inside the sandbox?

That's pretty much what I asked as well; but I feel comfortable making that
a separate problem.  The design for an implementation is not going to be
trivial I'm afraid (while the mechanics likely are).
Yes, it would certainly be nice to be able to do this with a runtime,
but it's logistically weird; i.e. what runtime is it ? does buildstream
have to provide one ? that would be undesirable I think...

Does the user need to provide a runtime containing the host tools
required for obtaining sources and data to inject into the pipeline ?

Sounds like an interesting option to provide, also it would have to
apply to all sources unilaterally (i.e. there is nothing different here
about a SourceTransform, it's just as much of a problem that we require
host git for git sources, or host ostree for ostree sources).

But still, we should definitely think about this separately.

* SourceTransform.track()

   This requires that previous Sources are not only tracked, but that
   they are also *fetched* for the tracked version, so we know that
   all previous sources are available to stage.
This is going to be a bit awkward. As far as I can tell, all source types as of
now can be tracked independently. Introducing the requirement of also fetching
the previous sources would mean that it will not be possible to bulk-track all
sources without also fetching them. In a large organization where there is a need
for frequently tracking the elements (probably in an automated fashion), this
is going to increase the time and space required to do that.

I think that is overstating it.  The requirement is that any Sources that are
preceding a SourceTransform in a sequence need to be fetched before
tracking of that SourceTransform can be executed.
In elements that contain no SourceTransform there is no difference.
Right.
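
To make the ordering concrete, here is the same kind of element as in
the example under discussion, annotated with what tracking would need
(the `cargo` source kind is hypothetical, it is only the running
example in this thread):

    kind: rust
    sources:
    - kind: tar
      url: downloads:thispackage.tar.xz
      # Must be tracked and fetched before the cargo source can track
    - kind: cargo
      # Tracking this needs the tarball above to be available to stage

An element whose sources are only plain Sources, say a single tar or
git source, would track exactly as it does today, without fetching
anything extra.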

A SourceTransform has an additional directory in context, which
contains the result of all previous sources.
    kind: rust
    sources:
    - kind: tar
      url: downloads:thispackage.tar.xz
    - kind: cargo
In this example, you haven't explicitly specified the additional
directory you mentioned above so I'm assuming that you don't expect it to be a
part of the bst file but to come from whatever instantiates the sources. If
that is correct then what happens when the first source (tar in this case)
specifies a destination directory using the `directory` attribute? How do we
communicate that destination directory to the second source (cargo in this
case) ?  Extending this problem, what happens if there are multiple sources
with different destination directories; how do we communicate those to the
final SourceTransform source?

That is an interesting problem that I hadn't considered.  This seems to imply
that a SourceTransform needs a source directory...    I have a feeling this is more of an edge case than 
the common case.
This is rather awkwardly three different questions in the same block.

  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In this example, you haven't explicitly specified the additional
  directory you mentioned above so I'm assuming that you don't expect
  it to be a part of the bst file but to come from whatever
  instantiates the sources.
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I have two presumptions here:

  * There is a sane default which depends entirely on the type of
    SourceTransform.

    For example with cargo/rust... it will make sense to default
    to a ./vendor subdirectory, and additionally generate a config
    file which informs the rust toolchain to search for crates in
    the ./vendor subdirectory instead of contacting the internet.

  * When there is a weird package, which does not "play well" with
    the sane default, a SourceTransform is like any other Source, it
    is also allowed to introduce its own YAML configuration to work
    around packages which don't conform to a norm.
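
As a sketch of that second case, a cargo SourceTransform might accept
something like the following. Both the `cargo` kind and the
`vendor-dir` option are made-up names for illustration, nothing like
this exists yet:

    sources:
    - kind: tar
      url: downloads:thispackage.tar.xz
    - kind: cargo
      # Hypothetical option: override the default ./vendor location
      # for a package which keeps its crates somewhere unusual
      vendor-dir: third_party/crates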

   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   If that is correct then what happens when the first source (tar in
   this case) specifies a destination directory using the `directory`
   attribute?
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I think this is rather a non-issue.

Rather think of it this way:

  * If you are staging a rust build tarball (or git or whatever) into
    a subdir named foo/, then you should stage your cargo
    SourceTransform element also in foo/.

  * If you did not match the directories when you declared your
    sources, the reason for things "not working" should be rather
    reasonably evident to you, since you did go ahead and explicitly
    put stuff in a subdir.
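
For instance, assuming the hypothetical cargo source honours the usual
`directory` attribute like any other Source, matching the directories
would look like this:

    sources:
    - kind: tar
      url: downloads:thispackage.tar.xz
      directory: foo
    - kind: cargo
      # Must match the directory the tarball was staged into
      directory: foo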

I could see this kind of situation arising where, for instance, you
have a multi-language build in some repository... where the root level
is something normal, and it contains some extensions in rust or python
in a specific subdir.

In that type of case, you would stage the whole repo at the root of the
build area as usual, but you would have to use the 'directory'
attribute for staging your SourceTransform - there are a couple of
plausible approaches here:

  o Either we say that the meaning of the 'directory' configuration
    for SourceTransforms is the CWD for it to "do its thing"

  o Or, we have additional configuration for it to "do its thing"
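
Roughly, the two options could look like this in an element. This is
only a sketch: the `cargo` kind is hypothetical, the repository URL is
a placeholder, and the `workdir` option name in the second variant is
invented for illustration.

With 'directory' meaning the CWD of the SourceTransform:

    sources:
    - kind: git
      url: upstream:project.git
      track: master
    - kind: cargo
      directory: extensions/rust

With a separate, plugin specific option instead:

    sources:
    - kind: git
      url: upstream:project.git
      track: master
    - kind: cargo
      # Hypothetical option name
      workdir: extensions/rust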

   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Extending this problem, what happens if there are multiple sources
   with different destination directories; how do we communicate those
   to the final SourceTransform source?
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is pretty much a domain specific problem, which can be solved in
ways which are appropriate to the SourceTransform implementation I
think.

Rather, what examples do you have in mind here ?

I can think of two scenarios:

  * You are staging various different components of the same build:

    So stage them in multiple places, and build the tree, and tell
    your SourceTransform what it needs to know about the build tree
    you've constructed.

  * You are staging a single build with multiple different
    subdirectories with different languages, maybe you have a repo
    with Go, Rust and Python:

    So stage 3 different SourceTransform elements after it, and tell
    each one of them separately where to go and do their thing.
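
A sketch of that second scenario, assuming for the sake of the example
that 'directory' tells each SourceTransform where to work. The `go`,
`cargo` and `pip` source kinds here are placeholders for whatever
plugins would end up being written, and the URL and subdirectory names
are invented:

    sources:
    - kind: git
      url: upstream:polyglot.git
      track: master
    # One SourceTransform per language, each pointed at its own subdir
    - kind: go
      directory: go
    - kind: cargo
      directory: rust
    - kind: pip
      directory: python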


Cheers,
    -Tristan


