Re: [BuildStream] Responsive, but not overly verbose UX - not-in-scheduler



On Wed, 2019-05-29 at 16:32 +0100, Jonathan Maw wrote:
On 2019-05-29 08:34, Tristan Van Berkom wrote:
Hi Jonathan,


[snip]


As far as I can see, rushing forward to implement this early logging
both:

  * Potentially regresses architectural corrections we made by ensuring
    that we do *not* have any dual standards/codepaths for logging.

  * Adds more burden to the larger task of process separation by adding
    yet more codepaths to deal with in this separation, instead of
    standardizing on a common API path for logging.

Please reconsider this.

Cheers,
    -Tristan


Hi Tristan,

Thanks for your architectural concerns, I wouldn't want to cause 
architectural regressions here, either.

Do I understand correctly that the correct standard logging path for 
reporting progress here would be in `Context.message()`?

The status bar is where we display "what is currently going on", the
Context.message() API is where we log events which occurred.


Here's a little breakdown of what the frontend currently uses to
display information to the user. Some of these, the frontend should
really never have had access to, but we never took the time to
implement proper abstractions.

I think this is the time for those abstractions to happen, as it will
assist in your objective, and be outright necessary for frontend and
scheduler process separation anyway.

 * Context.message()

   This is how log lines are ultimately generated, and the frontend
   _widget.py code has a renderer specifically for these messages.

   This won't have to change much for process separation, although we
   may stop communicating unique_id as the frontend may no longer have
   access to the data model (or an updated one).

 * Overall scheduler and queue statues

   This includes information such as:
   - How many elements overall are loaded in the session
   - How many of these elements need to be processed in this session
   - How many elements have been processed, skipped or failed in each
     queue

   The frontend really should never be poking it's nose all the way
   down into the scheduler's business, this should be communicated to
   the frontend more explicitly with an API provided by Stream().

   There are multiple reasons for this, but even without process
   separation, we should be maintaining the this information and
   context is in an API surface crafted for the frontend, we should
   have the scheduler updating that and be able to refactor the
   scheduler without impacting the frontend implicitly.

   With process separation this of course becomes a necessity.

 * Cache usage information

   A fairly recent addition, will also require some API on Stream()
   for the frontend with process separation in mind.

 * Tasks being displayed in the status bar.

   This is where we show actual ongoing progress of ongoing tasks.

   Currently this only supports timers, but could be extended such
   that tasks can have counters or "fractions", like: "42 / 657".

   These tasks rendered in the status bar are currently driven directly
   by scheduler jobs, which, similar to scheduler and queue statuses
   listed above; the frontend should really never have been observing
   directly at all, and will anyway be impossible with the incoming
   process separation.

   This is where I think any loading tasks should be represented, as
   tasks, tasks need not necessarily be backed by scheduler jobs; if
   they are provided in an abstract way by Stream as they anyway will
   *have* to be for process separation, then these abstract tasks can
   be driven however the Stream desires to drive them: They represent
   tasks to be displayed to the user while something is ongoing, for
   the purpose of displaying some progress.

I think that sorting out some abstraction and getting the status bar in
_frontend/status.py to never access the Scheduler, but to always
display tasks provided by Stream directly, is really the first step
towards both being able to represent progress of loading elements in a
uniform way, and also towards being able to display a frontend at all
where the scheduler is going to be in a separate process.

Also, based on what I know about the work to make the UI run in a 
separate process, I think that will cause there to be a regular frontend 
running at the point where we're loading elements.

Minor correction here: the frontend is always running, it is the main
entry point and it is in control of everything at all times: The
frontend is essentially god.

Rather you mean to say that the scheduling / worker process will
already be started by the time we start loading elements.

I think this is going to be true and I think it is desirable yes,
however it wont necessarily be true.


*However*, and this is an important detail to capture:

    Whether the scheduling process is already forked or not, and
    whether the function Scheduler.run() is currently being run or not,
    has absolutely no bearing at all on whether the frontend is
    currently displaying a status bar or not.

The status bar is where we display various ongoing progress, the log
lines is where we log "things that happened", clearly the status bar
needs to just pop up earlier, if we want to display ongoing progress
before scheduler jobs start getting processed.

 Would it be enough 
here to make sure that I keep in discussion with the people working on 
!1038 to make sure I'm using the same codepath as them, or are there 
other details I haven't taken into account?

It sounds like reporting when we load a junction will be tricker than I 
guessed. I'll make sure I'm on the same page as the people working on 
!1038, and have a look at whether Scheduler needs any work to run 
`Scheduler.run()` repeatedly in a clean way.

That should really be the least of your problems, running the scheduler
should never be assumed to be a one off thing - it's possible that such
an assumption has caused the code to grow a mutant limb or two, but I'm
certain that it wont be difficult at all to correct those mutations.

Actually mostly, I would surmise that any of these assumptions are
limited to the frontend, which has a notion about "sessions" and has a
callback in place when Scheduler.run() completes which it uses to
display a summary - that is just a little bit of frontend code which
probably needs a bit of adjustment (the current callback approach isn't
all that clean anyway).

Cheers,
    -Tristan



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]