Re: Managing buildstream repositories, variants, platforms, ...





On Mon, Aug 21, 2017 at 8:19 PM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:
On Mon, 2017-08-21 at 14:07 +0000, Sander Striker wrote:
> Hi,
>
> I'd like to start a discussion on how to best manage hundreds if not
> thousands of projects, with a need for building on multiple platforms
> (think: OS, compiler toolchain, architecture, ...).  Note that
> subsets of the projects may only need to build on subsets of all
> platforms.

Hi Sander,

Thanks for raising this. Interestingly, I was discussing configuration
data with Jürg just as your email arrived. Most of this reply has to do
with exposing configuration options, but I think there are a couple of
other points in your email too...

Roughly what we discussed:

  o We have a closing window of opportunity to change things we're
    unhappy with, before things start to harden and stabilize.

  o Currently we have "variants" and "arches" as possible configuration
    entry points

  o As we discussed in Manchester, we probably need some alternative
    way to configure things, allowing users to define their own
    conditionals such as "debug-level".

  o Variants are interesting because they offer a constraint resolution
    approach to authoring elements. When custom element configurations
    arise, depending elements may assert a compatible configuration
    or remain ambivalent.

  o Variant constraint resolution algorithms are hard

On top of needing to be correct, they need to be performant, which adds to the difficulty.
 
  o If the variant semantics are too rigid to express all of
    the kinds of configuration we need, then we need to either
    elaborate on variants or revise our approach to configuration

I really like the idea of variants, so I'm kind of hoping for a sign or
an idea that makes it all 'click' together nicely.

I hear you on this one.  
 
Barring a truly great idea, I am thinking of yanking out variants
altogether and replacing them with a user provided list of variables.

The project.conf would specify a list of available conditional keywords
and also specify some constraints and/or help strings about those
keywords, so buildstream can use that to interact with the user (the
user could provide the options via their user config and maybe on the
command line and interactively).
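
To make the proposal above concrete, here is a minimal Python sketch of
declared options being checked against user-supplied values. It is not
BuildStream's format or API: `OptionSpec`, `resolve_options`, and the
option names other than "debug-level" are hypothetical, and a real
project.conf would presumably express the declarations in YAML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptionSpec:
    """Hypothetical declaration of one project option."""
    name: str
    allowed: tuple
    default: str
    help: str = ""


def resolve_options(specs, user_values):
    """Merge user-supplied values over defaults.

    Raises ValueError for undeclared options or values outside the
    declared set, so mistakes surface before any build starts.
    """
    by_name = {spec.name: spec for spec in specs}
    unknown = set(user_values) - set(by_name)
    if unknown:
        raise ValueError(f"undeclared option(s): {sorted(unknown)}")

    resolved = {}
    for spec in specs:
        value = user_values.get(spec.name, spec.default)
        if value not in spec.allowed:
            raise ValueError(
                f"{spec.name}={value!r} is not one of {list(spec.allowed)} ({spec.help})"
            )
        resolved[spec.name] = value
    return resolved


# Illustrative declarations only; these names and values are made up.
PROJECT_OPTIONS = [
    OptionSpec("debug-level", ("none", "low", "high"), "none",
               "Amount of debug instrumentation"),
    OptionSpec("os", ("linux", "darwin"), "linux",
               "Target operating system"),
]
```

The help string travels with the declaration so that a front end could
reuse it when prompting the user interactively.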

The downsides I see to this approach are:

  o We no longer address the bigger combinatorial explosion
    of project output. The number of potential project outputs
    is (elements * (options * valid-values-per-option)).

  o An element can no longer assert a specific configuration
    of a dependency.

    So even if a given configuration of one element must be built
    against a specific configuration of its dependency, there is
    no way for buildstream to raise an error if the maintainer
    accidentally mismatches something one day, and that compatibility
    information is most likely forgotten.

Unless we allow user variable constraints on dependencies?
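
One way to picture constraints on dependencies is the Python sketch
below. It is an assumption about how such a check could look, not
anything buildstream implements: the data layout, the function name
`check_dependency_constraints`, and the element and option names in the
usage note are all hypothetical.

```python
def check_dependency_constraints(elements):
    """Return a list of human-readable mismatches; empty means consistent.

    `elements` maps an element name to a dict with:
      "config":   option name -> value the element is built with
      "requires": dependency name -> {option name -> required value}
    Both keys are optional. This layout is purely illustrative.
    """
    problems = []
    for name, element in elements.items():
        for dep, required in element.get("requires", {}).items():
            dep_config = elements[dep].get("config", {})
            for option, wanted in required.items():
                actual = dep_config.get(option)
                if actual != wanted:
                    problems.append(
                        f"{name} requires {dep} with {option}={wanted!r}, "
                        f"got {actual!r}"
                    )
    return problems
```

For example, if a hypothetical element "app" requires "apr" with
"pool-debug" set to "on" while "apr" is configured with "off", the
function reports that single mismatch instead of letting it pass
silently.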

I guess there are a lot of ways to fry the cat, maybe there is some
middle ground (I did think up something where variables are declared on
elements and elements can configure their dependencies... something
like variants with variables, but I expect it to be overly complex).

That seems to be the first thought that springs to my mind as well.
 
> There may be variants of projects that expose different behaviour. 
> For example, a variant of APR with APR_POOL_DEBUG turned on.
>
> There may also be variants of projects that are available in
> different versions, for instance, a current and a stable variant.

This confuses me.

I think a goal would be to only ever maintain one version of an element in any one branch.

Terminology is going to quickly get in our way...

Are you saying that if, say, I have a new version of a lower level dependency (e.g. a library), I should branch the integration candidate repository (containing all the bst files), and then change the library bst to point to that version?
 
There are some difficulties I can see here. In most cases, just
revisioning your modules with proper branches and using `bst track` is
enough to test one or two modules from the next/previous release cycle
in a given build (the ones you work on are usually the ones you care
about in this regard, so let's assume you can predict branch names and
release tags). This breaks down, however, when the new/old branch
differs in terms of dependencies or build instructions.

What other obstacles do you see here?

The obstacle I see has more to do with scale (thousands of projects).  Imagine changing one library to its next version and doing a test build of everything that depends on it.  Now say 20% of those projects fail to build and need fixing before they can proceed.  The question then becomes: how do you move the other 80% forward on the new version?
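
The "test build of everything that depends on it" step can be sketched
in Python as a reverse-dependency walk followed by a split on build
results. This is only an illustration under assumed inputs: the
dependency mapping, the function names, and the element names in the
usage note are hypothetical and not part of buildstream.

```python
from collections import deque


def reverse_dependents(deps, target):
    """Return every element that depends on `target`, directly or transitively.

    `deps` maps an element name to the list of elements it depends on.
    """
    # Invert the graph: dependency -> set of direct dependents.
    rdeps = {}
    for element, requirements in deps.items():
        for requirement in requirements:
            rdeps.setdefault(requirement, set()).add(element)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dependent in rdeps.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def split_by_build_result(dependents, failed):
    """Split dependents into (can move forward, needs fixing first)."""
    blocked = dependents & set(failed)
    return dependents - blocked, blocked
```

With deps = {"libfoo": [], "libbar": ["libfoo"], "app-a": ["libfoo"],
"app-b": ["libbar"]}, the dependents of "libfoo" are "libbar", "app-a"
and "app-b". The split is deliberately naive: it does not mark elements
as blocked merely because something they depend on failed.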

You could branch the integration candidate, and change the version of the library per above.  This could work, but now imagine having to do this for multiple libraries; once you hit double digits this already starts to explode.  You quickly run into the question: which integration candidate do I need to use to build application X in order to deploy it?
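
As a rough worst-case illustration of that explosion, assume each
pending library upgrade can independently be taken or not taken in an
integration candidate. That assumption, and the names below, are mine
and only meant as a sketch in Python.

```python
from itertools import product


def candidate_combinations(pending_upgrades):
    """Yield every old/new selection across the pending library upgrades."""
    names = sorted(pending_upgrades)
    for choice in product(("old", "new"), repeat=len(names)):
        yield dict(zip(names, choice))


def combination_count(num_pending_upgrades):
    """Number of distinct selections when each upgrade is a yes/no choice."""
    return 2 ** num_pending_upgrades
```

Three pending upgrades already give 8 possible candidates, and ten give
1024. In practice far fewer would be maintained, but each one that
exists is a possible answer to "which candidate do I build X from?".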
 
Can we discuss that and hopefully find a decent workflow that doesn't
involve maintaining multiple versions of elements in the same branch of
a buildstream project?

I think this is going to be the inevitable tug of war: balancing between more branches/forks of integration candidates vs more variance inside an integration candidate.  They both have their issues when dealing with more than low single digits.
 
> Now, multiple approaches spring to mind, with up- and downsides.
> There's the option of creating a repository (or branch) per
> platform.  But this comes at the expense of having to keep them in
> sync.
> There's the option of using variants, but this comes at the expense
> of additional specification, potentially in additional .bst files of
> type stack.  This comes with the possibility of a combinatorial
> explosion of variants built-in, which currently would lead to
> performance concerns.
>
> I'm happy to expand more, but am curious what was initially
> envisioned when it comes to managing buildstream repositories?

Originally I was thinking variants and arches all the way down (in
terms of conditionals); and where arches were not good enough, add
other ones (like "os" perhaps).

This seemed to hold together well enough but you've convinced me that
we need something a bit less rigid.

Beyond conditional YAML blocks and the configuration behind them,
however, lies interesting territory, though it is probably too large a
question (i.e. "how do we envision working with many projects, or many
buildstream projects").

That's fair enough.
 
It would be interesting to throw together a workflow diagram or such
detailing how a large workgroup would manage branches and land patches
to a collection of buildstream projects which come together to create
an appliance;

That statement actually carries something important.  Are you thinking that each integration candidate is only targeting one 'appliance'?  I was thinking more in terms of composing multiple appliances out of a single integration candidate.  Maybe recursive pipelines would alleviate this some, resulting in the common dependencies living in a single integration candidate, and 'appliance' specific integration candidates depending on that...
 
and how these teams interact together and how they try
out builds using feature "wip" branches maintained by separate teams,
etc. Also, it would be interesting to compare user stories for the
corporate many-team scenario with how things would work in the
distributed / floss scenario.

Definitely.  I'll see what I can come up with to help the discussion.
 

Cheers,
    -Tristan

Cheers,

Sander 

