Re: [BuildStream] Scheduler optimization



On Fri, 2019-03-29 at 12:25 +0000, Benjamin Schubert wrote:
> Hey all,
>
> Thanks for the detailed and quick answers.
>
> A general observation before I dig into the details: the reason I'm not
> going too deep into the technical details is that I might be overlooking
> some details in the scheduler and might need to adapt this when actually
> implementing the solution. Moreover, to ensure the quality of the solution
> and that we are on the right track, the merge request will be an easier
> place to discuss and adapt the solution.

I understand and agree that there will be discovery during
implementation.

> [...]
> > My greatest fear when I read the above is that an implementation might
> > end up having Element codepaths accessing the Scheduler.
> >
> > Please be careful to ensure that doesn't happen - the Element is not
> > allowed to know what the scheduler is and should not even have any
> > abstract knowledge of what the queues are.

> Agreed, my aim is to make the code and performance better, not to make it
> unmaintainable.

Of course, sorry if I came across the wrong way.

There are a lot of people working on this project, and it happens
frequently enough that we only end up discussing the actual
architectural changes after a lot of time is spent in implementation.

Often when that happens, we end up significantly reworking the code in
the merge request, and people can be upset by this.

The aim of my message is to bring the high level architectural changes
proposed here to light, and hopefully save time in review cycles this
way.

> > To maintain clean separation of logic, the elements should be able to
> > notify their owner (the Stream which ultimately creates them) that some
> > element-related state change has occurred (like that an element has been
> > "fetched" or "assembled", or maybe also that assemble or fetch has been
> > scheduled, like _schedule_assemble(), if that turns out to be important).

> I am pretty sure we could move the "_schedule_assemble" function out to the
> scheduler, which would make more sense and make everything easier. Thoughts?

Actually, the way _schedule_assemble() is called is already a bit
worrisome; it appears to be called only as a side effect of
_update_state(), which (I guess) allows the scheduler (queues) to
observe whether a required element is ready to be assembled.

I'm sure whatever refactoring is done in this area will result in
something more sensible.

> > Once the Stream has received a notification of an element-related state
> > change, it can then delegate the decision of whether to promote the
> > element into a new queue to the Scheduler or its queues - but this
> > business logic of course belongs to the Scheduler, while only
> > element-related state changes are known by the Element.

> I was planning to do this more in the queues themselves, which might make the
> code easier to read. Would this be good with you a priori?

Certainly this makes sense, I think the queues are a component of the
scheduler.
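
To illustrate the direction I have in mind, here is a very rough sketch -
the method and callback names below are made up for illustration, they are
not the existing interfaces of Element, Stream or Scheduler:

    # Sketch only: the callback and method names here are placeholders,
    # not existing BuildStream API.

    class Element:
        def __init__(self, notify_state_changed):
            # The Stream hands the Element a plain callable at construction
            # time, so the Element never learns what a Scheduler or Queue is.
            self._notify_state_changed = notify_state_changed
            self._fetched = False

        def _fetch_done(self):
            self._fetched = True
            self._notify_state_changed(self, "fetched")

    class Scheduler:
        def __init__(self, queues):
            self._queues = queues

        def element_state_changed(self, element, event):
            # Business logic lives here: each queue decides whether the
            # element is now ready for its stage of processing (the
            # consider() method is hypothetical).
            for queue in self._queues:
                queue.consider(element, event)

    class Stream:
        def __init__(self, scheduler):
            self._scheduler = scheduler

        def create_element(self):
            # The Stream owns the Elements it creates and wires in the callback.
            return Element(notify_state_changed=self._on_element_state_changed)

        def _on_element_state_changed(self, element, event):
            # The Stream only forwards the notification; promotion decisions
            # stay with the Scheduler and its queues.
            self._scheduler.element_state_changed(element, event)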

> > We will still want the 3 status lists for the UI - and I also think you
> > probably still need a ready list in order to throttle what jobs can be
> > launched using the _scheduler/resources.py logic.

> Implementation details, but no, we don't need two lists, just one.

The other 3 status lists I was referring to are not important for
optimization, these just record what was processed/failed/skipped,
allowing us to make reports at the end of the session.

In any case, I think we're on the same page.
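
To be clear about which lists I mean, it is roughly the bookkeeping below
(again only a sketch with placeholder names, not the real Queue
implementation - in particular resources.reserve() is a stand-in for the
_scheduler/resources.py accounting, not its real interface):

    class Queue:
        def __init__(self):
            self._ready = []     # elements waiting for resources; the only
                                 # list the scheduling logic itself needs
            self.processed = []  # these 3 lists exist purely so we can
            self.failed = []     # report what happened at the end of
            self.skipped = []    # the session

        def mark_ready(self, element):
            self._ready.append(element)

        def harvest_jobs(self, resources):
            # Launch as many ready jobs as the resource tokens allow.
            launched = []
            while self._ready and resources.reserve():
                launched.append(self._ready.pop(0))
            return launched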

> [...]
> > For the second aspect, of course later queues should have priority over
> > previous queues and that shouldn't change (this way we start building
> > things right away instead of waiting for everything to pass through the
> > pull queue before starting our first build).

> That is wrong, we don't need this global ordering. However, we do need
> ordering per resource, and in that space, the only queues for which it
> matters are push and src-push. This is because the rest of the queues don't
> share resources, and we therefore don't need to care about ordering.

I'm not sure that it is actually wrong.

From the perspective of maximizing available resource usage, you are
correct that it is not the only way - the other perspective is that it
is desirable to have an element get processed completely (i.e. reach
the end of the queues) as soon as possible. Whether that is worthwhile
is debatable, but I think it is.
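
Concretely, keeping that behavior only means that when handing out resource
tokens we ask the later queues first, something like (sketch, reusing the
hypothetical harvest_jobs() from the previous sketch):

    def schedule_jobs(queues, resources):
        # Later queues get first pick of the available resource tokens, so an
        # element that is already deep in the pipeline reaches the end of the
        # queues as soon as possible; earlier queues fill whatever is left.
        jobs = []
        for queue in reversed(queues):
            jobs.extend(queue.harvest_jobs(resources))
        return jobs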

> [...]
> On Fri, Mar 29, 2019 at 10:20 AM Sander Striker <s striker striker nl> wrote:
>
> > Hi,
> > [...]

> > Observation: there is some speculation-based concern/fear in this
> > thread.  I prefer we assume that we are all trying to do the right
> > thing in terms of architecture, code quality and actual usability
> > (which includes performance).  I am sure that we'll come across
> > some difficult trade-off calls we'll need to make along the way,
> > but let's do that as we encounter them, rather than preemptively
> > take a stance.

Again, sorry if this message came across the wrong way.

This proposal is focused only on the algorithmic changes and doesn't go
far into discussing what actual architectural changes will take place
to achieve it.

As I mentioned above, I think that at least the high level changes of
how the codebase will be (re)organized as a result are worth bringing
to light before writing code.

Getting on the same page from the beginning usually helps us save time
and energy later on.

Cheers,
    -Tristan


