Re: [BuildStream] Responsive, but not overly verbose UX - not-in-scheduler



Hi Daniel,

So first I should provide some background I didn't provide before,
which is that we used to actually have UI feedback showing:

  * A counter of how many files were loaded so far
  * A counter of resolving elements (showing [resolved/total])
  * A counter showing resolved cached state (showing [cached/total])

At load time, before the regular logger kicked in.

We scrapped this dual standard of logging in order to have a more
consistent UI and to improve code reuse; this was long before the
arrival of junctions.

It was a bit challenging to ensure the application initialized things
in the right order, and to untangle some spaghetti so that we could
initialize the logger earlier on, and I think it would be a shame to
move back to a dual standard of logging.

On Thu, 2019-04-18 at 14:59 +0100, Daniel Silverstone via buildstream-list wrote:
> On Thu, Apr 18, 2019 at 17:22:02 +0900, Tristan Van Berkom via buildstream-list wrote:
>> The fact that we don't have any feedback about the fetching of
>> junctions, and also that the user settings (e.g. fetchers) have no
>> impact on the junction fetching, is rather buggy indeed (I use a system
>> monitor, and I feel rather offended that BuildStream is causing network
>> activity to occur without telling me about it or allowing me to police
>> it in a way that is consistent with all other fetch operations).

> I'd note that fetch is only half the story.  The staging of the junction source
> to enable loading of elements from within it is often even worse for WSL than
> cloning the junction's repo in the first place.

Yes, and I'd note additionally that unpoliced fetch is not restricted
to load time either.

There are cases in other commands which either fetch sources or pull
build trees on demand. I don't think any of that code has the right to
run outside of the scheduler, unbounded by the configuration settings,
and, more relevant to this proposal, without the regular user feedback
associated with these tasks.

>> This is what I had in mind in order to fix these two problems:
>>
>>   * Pop up the status bar right away instead of deferring it

By "status bar" do you mean what I think of as the scheduler UI ?

It is amusing to me that you call this the "scheduler UI" :)

Sure, the status bar needs some enhancements and a better API such that
it can be used more freely.

As far as I can see, the status bar has 4 components:

  * The Header:

    ~~~~~~~~~~~~ (elements to process / total elements) ~~~~~~~~~~~~~~

  * The Queues:

    (Track: 0 0 0) -> (Pull: 0 0 0) -> (Build: 0 0 0) -> (Push: 0 0 0)

  * The separator with cache usage:

    ~~~~~~~~~ cache 67% ~~~~~~~~~

  * The height-for-width job space at the bottom


So, we need to be able to have the Header behave differently, and
perhaps make what is displayed in the middle settable via an API on
the Status class.

We need a way to disable/enable the printing of the Queues, because we
probably don't want to show the Queues when only fetching something.
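
To illustrate the kind of API I mean, here is a minimal sketch; the
method names are invented for illustration, nothing like them exists
today:

    # A minimal sketch, assuming hypothetical set_header_message() and
    # set_queues_visible() methods; these are not existing BuildStream
    # API, they only illustrate the kind of enhancement discussed above.

    class Status:

        def __init__(self):
            self._header_message = None  # overrides the default header content
            self._queues_visible = True  # whether to render the queue row

        def set_header_message(self, message):
            # E.g. "Loading elements: 342" while the loader is running
            self._header_message = message
            self.render()

        def set_queues_visible(self, visible):
            # Hide Track/Pull/Build/Push when only fetching a junction
            self._queues_visible = visible
            self.render()

        def render(self):
            header = self._header_message or "(elements to process / total elements)"
            print("~~~~ {} ~~~~".format(header))
            if self._queues_visible:
                print("(Track: 0 0 0) -> (Pull: 0 0 0) -> (Build: 0 0 0) -> (Push: 0 0 0)")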

We probably need something in between the Scheduler's Job class and the
Status.add_job() / Status.remove_job() API, so that the Status doesn't
expect every job to be a scheduler Job.

I.e. we would want an abstract object which might be backed by a
Scheduler Job, or it might be implemented as a different kind of task.

To spell that out more clearly, let's just call it a "Task" (see the
sketch after this list):

  * When the Scheduler signals that a job has started, then
    the Stream creates a Task for the frontend to see.

  * When the Stream does any kind of task, like loading elements or
    resolving cached state, it also creates a different kind of Task
    for the frontend to see.

  * Since the scheduler is not running when loading objects, there
    needs to be some ticking controlled by the Stream or App to update
    the UI periodically.

    I believe that the Loader still has the legacy 'ticker' callback
    which we never removed, from back in the days when we provided
    this kind of feedback; it can be used to tell the status bar
    to refresh itself while the load process is ongoing.

  * Have the Loader trigger a callback mentioning that it requires a
    fetch operation (or multiple fetch operations) to be performed.

  * Have the Stream run the scheduler on its behalf to fetch the
    junction.
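
To make that concrete, here is a rough sketch; every name in it is
hypothetical, it only shows how a scheduler Job and a load-time
activity could look identical to the frontend (the exact ticker
signature is assumed for illustration):

    # A rough sketch of the abstract "Task" idea; all names here are
    # hypothetical, showing how a scheduler Job and a load-time activity
    # could be presented to the frontend uniformly.

    class Task:
        """Something the frontend can display, however it is backed."""

        def __init__(self, action_name, full_name):
            self.action_name = action_name  # e.g. "Fetch", "Build", "Load"
            self.full_name = full_name      # e.g. an element or project name

    class Stream:

        def __init__(self, status):
            self._status = status
            self._tasks = []

        def job_started(self, job):
            # Signalled by the Scheduler: this Task is backed by a real
            # scheduler Job
            self._add_task(Task(job.action_name, job.name))

        def load(self, loader, targets):
            # A Stream-side activity which is not a scheduler Job at all,
            # but is presented to the frontend as a Task just the same
            task = Task("Load", ", ".join(targets))
            self._add_task(task)

            # The legacy 'ticker' callback drives periodic UI refreshes
            # while the load is ongoing (signature assumed here)
            elements = loader.load(targets, ticker=lambda name: self._status.render())

            self._remove_task(task)
            return elements

        def _add_task(self, task):
            self._tasks.append(task)
            self._status.render()

        def _remove_task(self, task):
            self._tasks.remove(task)
            self._status.render()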

> If this essentially means we start the scheduler always and treat all longer
> running operations semi-equally, then I think it could be an interesting way to
> implement the goal of my proposal.  I didn't want to suggest it because the
> scheduler seemed so focussed on progressing fully realised elements from one
> end of a semi-rigid pipeline to the other.

It definitely means this yes.

I don't think "start the scheduler always" is the right wording of this
though. Rather, it should be very fine to have Stream() call
Scheduler.run() multiple times.

This of course means a little bit of refactoring, but really not that
much; currently there is an assumption that Scheduler.run() will run
only once, and it is really only related to the UI, which could do with
a bit more decoupling anyway.
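
As a sketch of that flow, with stub classes that only mimic the shape
of the Stream/Scheduler relationship (none of this is the real
BuildStream API):

    # Illustrative stubs only: the point is that nothing in the
    # Scheduler assumes it runs exactly once, or with a fixed pipeline.

    class Queue:

        def __init__(self, name):
            self.name = name
            self.items = []

        def enqueue(self, items):
            self.items.extend(items)

    class Scheduler:

        def run(self, queues):
            # Process whatever queues the Stream handed us, however
            # many times we are invoked
            for queue in queues:
                for item in queue.items:
                    print("{}: {}".format(queue.name, item))

    class Stream:

        def __init__(self):
            self._scheduler = Scheduler()

        def fetch_junction(self, junction_sources):
            # Hypothetical target for a Loader callback: run the
            # scheduler for just this fetch, bounded by the user's
            # fetcher settings, then hand control back to the loader
            fetch = Queue("Fetch")
            fetch.enqueue(junction_sources)
            self._scheduler.run([fetch])

    stream = Stream()
    stream.fetch_junction(["base.bst (junction)"])
    # ...and later the Stream calls Scheduler.run() again with the
    # queues for the main operation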

>> It should be feasible to fetch all of the junctions encountered in
>> the first pass load of one project at once in parallel, rather than
>> serializing the potential fetches (all while respecting the
>> configuration of how many fetches the user has allowed to occur in
>> parallel).

> Since we (currently) load elements depth-first, we don't encounter junctions in
> any kind of parallel way as far as I can tell.

Yes, continuing to do it serially at first is certainly an option, and
I wouldn't expect an initial implementation to solve every problem at
once.

However, as we already have a dual pass load process for the loader, it
doesn't seem far-fetched that we could extend this to fetch all
junctions on a per-project-depth level in parallel (in the case of a
single project having multiple junctions, we fetch them all, and then
do the same for the subprojects' recursive junctions, etc).
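
A sketch of what that recursion could look like; the project interface
and the fetch_in_parallel() helper are invented names, the latter
standing in for a Scheduler.run() over a fetch queue:

    # Hypothetical sketch of per-project-depth junction fetching;
    # project.junctions(), junction.subproject() and fetch_in_parallel()
    # are invented names for illustration.

    def fetch_junctions_by_depth(project, fetch_in_parallel):
        # First, every junction this one project declares is fetched
        # together, bounded only by the user's configured fetcher count
        junctions = project.junctions()
        if junctions:
            fetch_in_parallel(junctions)

        # Then each subproject's own junctions are handled the same
        # way, one project depth at a time
        for junction in junctions:
            fetch_junctions_by_depth(junction.subproject(), fetch_in_parallel)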

>> This way, if we enhance progress indicators in some way in the future,
>> it can be applied equally and consistently to all operations which
>> support progress reporting (Source plugins could optionally support
>> such a reporting API one day, the loader might provide a counter of how
>> many elements were loaded so far, etc).

> If we allowed a progress stream to come back from jobs toward the scheduler's
> UI then yes that would be a not-unreasonable way to do this.

That has been vaguely on the roadmap forever; everyone thinks it would
be very nice to display richer progress information for ongoing tasks
where possible, it's just not a priority to implement.

Still, it would be good to position the codebase in such a way that it
is easy to implement in a uniform fashion, the moment that someone
actually has time to do it :)

>> What do you think, is there a reason we need to add an additional
>> logging mechanism for early logging that I overlooked, or a reason that
>> it would be more desirable ?

> If you're happy with the idea of breaking the scheduler away from its rigid
> focus on only progressing elements through a fixed pipeline then this could be
> a viable approach.  There'd need to be some work done to allow for jobs which
> run entirely within the parent rather than always farming out to subprocesses,
> though it's possible we already have that with the cache jobs (I don't know
> either way).

The scheduler does not have a rigid focus on progressing elements
through a fixed pipeline.

The scheduler just processes Queues, and the Stream decides what Queues
to provide the scheduler in what order, every time that it runs the
Scheduler.
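
For example, a toy illustration of that division of responsibility
(the queue names mirror BuildStream's queues, but this composition
function is invented):

    # Illustrative only: the Stream composes the queue list per
    # operation, and the Scheduler consumes whatever it is given.

    def make_queues(command, track_first=False):
        queues = ["Track"] if track_first else []
        if command == "fetch":
            queues.append("Fetch")
        elif command == "build":
            queues += ["Fetch", "Pull", "Build", "Push"]
        return queues

    # A junction fetch is then just another scheduler run over a
    # single queue:
    print(make_queues("fetch"))        # ['Fetch']
    print(make_queues("build", True))  # ['Track', 'Fetch', 'Pull', 'Build', 'Push']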

Also, I consider the fact that we inconsistently run jobs like "fetch"
or "pull" in the main process without properly going through the
regular scheduler APIs to be a bug. I don't expect fixing this to cause
any negative side effects.

> The *critical* thing here is that the goal is to *ALWAYS* provide an indication
> of progress toward the user's desired outcome within a few seconds, and to
> never block output entirely during that time.  Ultimately I don't mind how
> that's achieved so long as it's consistent and easy to add anywhere we end up
> with a long-running job to do.

Of course, we should.

Cheers,
    -Tristan


