Re: [BuildStream] Responsive, but not overly verbose UX - not-in-scheduler



Hi Tristan,

(comments inline)

On 2019-06-07 08:28, Tristan Van Berkom wrote:
1. Store all state that the UI needs in a single place.

    To this end, I plan to create a _frontend/uistate.py file, which 
contains the classes `UIState` (stores all state, owned by App and 
passed to Status and LogLine so they can read the state), `TaskGroup` 
(as a proxy for queues, stores its name, and lists of active, processed, 
skipped and failed tasks), and `Task` (a replacement for the 
_StatusJob).

I think this is correct, but I don't want this to be in _frontend/.

Rather, I want this to be owned by the core - think of it this way: 

   "This is state which the core offers to any frontend."

Whether or not we eventually develop other frontends than our CLI
frontend, it makes sense to separate the frontend and core this way, so
that Stream is a thin API surface library which allows any frontend to
be easily developped.

So essentially yes, lets manage state some state in our Stream library
for the sake of frontends, but:

  * Lets refer to this as state, not UI state - the core need not know
    exactly what a frontend might do with this state.

  * Lets keep this state in the core as a frontend facing API, probably
    you should be able to *reach* it through Stream() (that is not to
    say it should be implemented *in* _stream.py specifically, just
    that there should be a way to *obtain* this state from/through
    APIs on Stream()).

Does this separation make sense ?


So, my understanding of how this would work for Queue active/processed/skipped/failed jobs is:

Based on the principles of:
* Stream *is* core, and provides a suitable interface that frontends can subscribe to i.e. in a same-process model they can register callbacks to be notified when the state changes, in a multiple-process model they would be given a python multiprocessing Queue, and would consume messages passed through it, though the details of what a multiple-process model would do don't affect my work at the moment. * Stream should hold all information that frontends need to access (i.e. isn't just a translation layer that accesses other parts of core when it needs to provide information)

The changes to fit this model are:
* A new `State` class (owned by Stream) would hold all the information we want to make available to frontends. * State holds a list of all active jobs (i.e. the job_start_callbacks and job_complete_callbacks that the Scheduler call are now defined by State) * State holds tallies of `TaskGroup`s, which hold tallies of how many jobs have been skipped/processed/failed. - Since we only ever check the length of the skipped/processed lists, they may as well be integer counts stored only in State. * It is possible to register job_start_callbacks and job_complete_callbacks with State, so that frontends can have a list of all active jobs to print. * It is possible to register queue_status_changed_callbacks with State, so that frontends can report queue statistics. * It is possible to register task_groups_changed_callbacks, which report the names of all the queues now used by the Scheduler. * Since our frontend doesn't just print a notification every time something happens, it needs its own local representation of this data.

There are some complications that I'm not so sure about my solutions to:
* `LogLine.print_summary()` will print every failure message for an element if it ended up in a queue's failed_elements. I think this is because an element job may succeed after a retry, and if it does we're not interested in the failure messages. - I think the best way to handle this would be to maintain a separate list of failed jobs (doesn't need to be separated by queue), which the frontend also subscribes to changes to. * `App._handle_failure()` directly removes an element from the list of failed_elements and enqueues it again. - This is going to be something that Tom and Phil have to deal with as well, but in the meantime, this mainly just means that the failed_elements list still needs to exist in Queue's internals.

So, to answer your question, I think it makes sense, and I have a fairly clear idea of what to do for informing the frontend about currently-running jobs and the state of the queues.

Thanks,

Jonathan

--
Jonathan Maw, Software Engineer, Codethink Ltd.
Codethink privacy policy: https://www.codethink.co.uk/privacy.html


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]