Re: [BuildStream] Scheduler optimization



Hey all,

Thanks for the detailed and quick answers.

A general observation before I dig into the details: the reason I'm not
going too deep into the technical details is that I might be overlooking
some details in the scheduler and may need to adapt this when actually
implementing the solution. Moreover, to ensure the quality of the
solution and that we are on the right track, the merge request will be
an easier place to discuss and adapt it.

Now inline responses:

Jonathan Maw:
If I understand you correctly, you intend for every element that's done
being fetched to be added to the BuildQueue.
Currently, the fetch queue has an option to skip all cached elements,
which adds the element to the skipped_elements list for the frontend to
read. AIUI here you'd have to also directly add the element to the
BuildQueue.

Yes, I consider this an implementation detail. Cached elements will be
skipped as before.
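
For illustration, here is roughly what I have in mind, as a minimal
sketch; none of these names (FetchQueue, enqueue, is_cached, etc.) are
meant to be the real BuildStream API:

    # Hypothetical sketch of a push-based FetchQueue; names are
    # illustrative, not the actual BuildStream API.
    class FetchQueue:
        def __init__(self, build_queue):
            self.build_queue = build_queue
            self.skipped_elements = []   # still read by the frontend

        def enqueue(self, element):
            if element.is_cached():
                # Skipped at enqueue time: no fetch job is launched,
                # the element goes straight to the build queue.
                self.skipped_elements.append(element)
                self.build_queue.enqueue(element)
            else:
                self.launch_fetch_job(element)

        def on_fetch_done(self, element):
            # Push model: completion feeds the next queue directly,
            # no polling pass over all elements.
            self.build_queue.enqueue(element)

        def launch_fetch_job(self, element):
            ...  # launch the actual fetch job; out of scope here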


Is there a particular reason you prefer this way of doing it to
deferring the logic of adding elements to the BuildQueue to
Element._schedule_assembly()?

_schedule_assemble also allows the element to start being fetched; the
fact that it touches both the Fetch and Build queues is a consequence of
the current poll model and should not be required in the future.


Tristan Van Berkom:
For instance, one of the purposes of `Queue.status()` is to skip
elements without ever having launched a job, as Jonathan points out in
the sibling post - I think that is a valuable filter to have as
elements pass through queues - however it probably only has to happen
once per element which enters a queue, right?

Yes, that should be handled at enqueue time.

My greatest fear when I read the above is that an implementation might
end up having Element codepaths accessing the Scheduler.

Please be careful to ensure that doesn't happen - the Element is not
allowed to know what the scheduler is and should not even have any
abstract knowledge of what the queues are.

Agreed, my aim is to make the code and performance better, not to make it
unmaintainable.

To maintain clean separation of logic, the elements should be able to
notify their owner (the Stream which ultimately creates them) that some
element related state change has occurred (like that an element has
been "fetched" or "assembled", or maybe also that assemble or fetch has
been scheduled, like _schedule_assemble(), if that turns out to be
important).

I am pretty sure we could move the "_schedule_assemble" function out to the
scheduler, which would make more sense and make everything easier. Thoughts?

Once the Stream has received a notification of an element related state
change, it can then delegate the decision of whether to promote the
element into a new queue to the Scheduler or its queues - but this
business logic of course belongs to the Scheduler, while only element
related state changes are known by the Element.

I was planning to do this more in the queues themselves, which might
make the code easier to read. Would that be acceptable to you, a priori?
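
To make the intended separation concrete, a rough sketch (all names
here are hypothetical, not existing BuildStream API):

    # The Element only reports state changes through a callback its
    # owner installed; it knows nothing about queues or the Scheduler.
    class Element:
        def __init__(self, notify):
            self._notify = notify            # provided by the Stream

        def _fetch_done(self):
            # ... element-internal state updates ...
            self._notify(self, "fetched")    # pure state-change event

    class Stream:
        def __init__(self, scheduler):
            self._scheduler = scheduler

        def element_event(self, element, event):
            # Promotion decisions live in the scheduler's queues,
            # never in the Element itself.
            self._scheduler.element_event(element, event)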

We will still want the three status lists for the UI - and I also think
you probably still need a ready list in order to throttle what jobs can
be launched using the _scheduler/resources.py logic.

Implementation details, but no, we don't need two lists, just one.

Of course, once an element ends up in a queue's ready list, one needs
only to consult the resources to know if a job is allowed to run yet,
and one need not traverse the entire list but can just take the next
ready job.

That only holds if you have very few builders, as mentioned previously.
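
What I mean by a single list, sketched with hypothetical names (this is
not the real _scheduler/resources.py API): we pop ready elements off one
deque for as long as the resource accounting permits another job, so
with many builders the loop simply runs more times per round.

    from collections import deque

    class Resources:
        def __init__(self, max_jobs):
            self.max_jobs = max_jobs
            self.running = 0

        def reserve(self):
            if self.running < self.max_jobs:
                self.running += 1
                return True
            return False

        def release(self):
            self.running -= 1

    def schedule(ready, resources, launch):
        # One ready list is enough: keep taking the next ready job
        # while resources allow, no traversal of all elements.
        while ready and resources.reserve():
            launch(ready.popleft())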


This is pure speculation and it might look different, but I think that
there is still some decision making that is in the domain of a specific
Queue which should still remain that way, and not jumbled into a big
master function which has too much knowledge of how all the individual
queues should work.

Yes, we obviously need to do this. But only once.

Yes, but again I expect this to change significantly - probably we want
to reconsider whether there is really a need for `_schedule_assemble()`
and `_set_required()` to be separate things - instead we probably want
symmetry in events and have a schedule/done event triggered for each
kind of state transforming event.

Again, implementation details, but I would be happy to move this to the
scheduler.
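
For example, the symmetric events could look something like this (a
sketch only, the event names are made up):

    from enum import Enum, auto

    class ElementEvent(Enum):
        FETCH_SCHEDULED = auto()
        FETCH_DONE = auto()
        ASSEMBLE_SCHEDULED = auto()
        ASSEMBLE_DONE = auto()

    # Every state transition then emits one SCHEDULED and one DONE
    # event, instead of special cases like _schedule_assemble() and
    # _set_required() behaving differently.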


More importantly, when you say: "Add the element to a list in every
dependency that is not ready", I agree with the concept, but I think
that these lists should not necessarily be stored on the elements
proper.

These lists are strictly scheduling related logic and the Element
should not have to know that they exist.

Implementation detail, but I will be careful not to leak more
information than necessary to the Element.
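
Concretely, the bookkeeping could live entirely on the scheduler side,
along these lines (a hypothetical sketch, not existing API):

    # The scheduler owns the "who waits on whom" mapping, so the
    # Element never learns these lists exist.
    class Scheduler:
        def __init__(self):
            self._waiting_on = {}   # dependency -> blocked elements

        def block(self, element, dependency):
            self._waiting_on.setdefault(dependency, []).append(element)

        def dependency_ready(self, dependency):
            for element in self._waiting_on.pop(dependency, []):
                self.maybe_enqueue(element)

        def maybe_enqueue(self, element):
            ...  # promote once all of its dependencies are ready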


For the second aspect, of course later queues should have priority over
previous queues and that shouldn't change (this way we start building
things right away instead of waiting for everything to pass through the
pull queue before starting our first build).

That is wrong: we don't need this global ordering. We do, however, need
ordering per resource, and there the only queues for which it matters
are push and src-push. The rest of the queues don't share resources, so
their relative order doesn't matter.
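
Sketched, per-resource ordering amounts to something like this (the
queue and resource names are illustrative only):

    # Only queues sharing a resource need a priority order among
    # themselves; there is no global ordering across all queues.
    RESOURCE_QUEUES = {
        "upload": ["push", "src-push"],   # ordered: push first
        "fetch": ["fetch"],               # sole user of its resource
        "build": ["build"],
    }

    def has_ready_work(queue_name):
        ...  # stand-in: ask the queue whether it has a ready element

    def next_queue_for(resource):
        for queue_name in RESOURCE_QUEUES[resource]:
            if has_ready_work(queue_name):
                return queue_name
        return None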

The theory is very sensible and I am happy to see us go in this
direction, as I mentioned at the top; most of my concerns are only
about ensuring sane encapsulation of logic separated into sensibly
isolated modules.

Let's ensure the merge request to come is thoroughly reviewed and does
not expose more than necessary.



Jürg Billeter:
While we're changing the scheduler architecture, I'm wondering whether
we should also consider implementing/supporting a make job server (or
at least make sure it would be possible to support this in the future).
This would likely improve scheduling quite a bit for local execution.
And the push-based model would make this a much better fit, i.e., it
matches the direction we're going.

While I have nothing against moving towards a job server, I think it
would be best to refactor the pipeline first and then adapt it for a
job server if necessary. The change will already be quite large.
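
For reference, the jobserver protocol itself is tiny: free job slots
are tokens in a pipe, and a push-based scheduler could block on a token
right before launching a job. A minimal sketch of the idea:

    import os

    # Job slots as tokens in a pipe, as in the GNU make jobserver.
    read_fd, write_fd = os.pipe()
    max_jobs = 4
    os.write(write_fd, b"+" * (max_jobs - 1))  # N-1 tokens; one slot is implicit

    def acquire_slot():
        os.read(read_fd, 1)        # blocks until a token is free

    def release_slot():
        os.write(write_fd, b"+")   # hand the token back when done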

"set_required" just means that the artifact is required. It doesn't
mean that the Element needs to be built and thus also fetching may not
be needed. I.e., "schedule_assemble" is probably closer to the right
trigger.

Correct.

Can't we use the recently added reverse dependency lists for this, at
least as a starting point? It might be slightly less efficient but we
anyway already notify all reverse dependencies to trigger cache key
calculations, so it might not make a big difference and could be
simpler, especially with regards to the separation between Element and
scheduling logic that Tristan mentioned.

I can try, or use a counter. As long as we end up with a more efficient way,
I'll be happy, and we can refine afterwards.
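
The counter variant would be straightforward (a sketch only, not tied
to the existing reverse dependency implementation):

    # Each element tracks only how many dependencies are still
    # pending; it is enqueued when the counter hits zero.
    class Node:
        def __init__(self, dependencies):
            self.pending = len(dependencies)
            self.reverse_deps = []   # filled in by the graph builder

    def dependency_done(node, enqueue):
        for rdep in node.reverse_deps:
            rdep.pending -= 1
            if rdep.pending == 0:
                enqueue(rdep)        # now ready to be scheduled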

Once again, thanks everyone for your input!

Ben

On Fri, Mar 29, 2019 at 10:20 AM Sander Striker <s.striker@striker.nl> wrote:

Hi,

On Fri, Mar 29, 2019 at 8:58 AM Jürg Billeter <j@bitron.ch> wrote:

On Fri, 2019-03-29 at 16:45 +0900, Tristan Van Berkom wrote:
On Fri, 2019-03-29 at 08:30 +0100, Jürg Billeter wrote:
Hi Benjamin,
[...]
While we're changing the scheduler architecture, I'm wondering whether
we should also consider implementing/supporting a make job server (or
at least make sure it would be possible to support this in the future).
This would likely improve scheduling quite a bit for local execution.
And the push-based model would make this a much better fit, i.e., it
matches the direction we're going.

Wouldn't it also make sense to consider job servers in the remote
execution scenario?

I expect that most remote execution setups involve large servers where
many remote workers are running.

Theoretically, yes. However, the BuildStream scheduler isn't running on
the remote workers, so this would be mostly independent. And it could
raise concerns of scheduling fairness across jobs from different
clients.


Exactly.  BuildStream only has a job here as a remote execution _client_: it should submit _all_ the jobs
that it can as soon as it can, and then rely on the remote execution service to execute them
efficiently and fairly.


(That said, I think we're a bit off topic here, and we should be
careful not to lump too many changes into the same activity; it just
makes sense to consider how job servers fit into a future where this
scheduler change happens.)

Yes, I think it's worth considering when designing the new scheduler
architecture, however, actual job server support would most likely be a
separate (and optional) effort after the core changes.


+1.

Observation: there is some speculation-based concern/fear in this thread.  I prefer we assume that we are
all trying to do the right thing in terms of architecture, code quality and actual usability (which
includes performance).  I am sure we'll come across some difficult trade-off calls we'll need to make
along the way, but let's make those as we encounter them, rather than preemptively taking a stance.


Cheers,
Jürg


Cheers,

Sander


