[BuildStream] Reporting progress for long-running tasks



Hi all,

I've been looking into the problem of tasks that can run for a long time without any feedback.

The problem
-----------

There are times when you're sitting in front of an idle terminal, with no way
of telling whether it's busy or stuck.

There are two broad cases where this happens:

1. A job running in the scheduler, e.g. fetching sources or assembling
   artifacts.
2. An activity running in the main process, e.g. loading elements or resolving
   elements' cached state.

Furthermore, there are broadly three different things we can display for
progress:

1. The current number of subtasks completed, since the total is not knowable.
2. The current and total number of subtasks completed. The total number might
   increase during the life of the task (e.g. discovering more elements), too.
3. No number can be known for progress (e.g. shelling out to a command), but
   text might be available.

At the end of this mail I've included an appendix of some long-running tasks
and how progress reporting could be added to them.

My proposed solution
--------------------

The basic functionality to support this is:

* Task (from `_state.py`) is extended with a "task changed" callback, which
  Status (in `_frontend/status.py`) subscribes to, and uses to decide whether
  the corresponding StatusJob's size has changed (requiring recalculation).
  - StatusJobs will now have an additional field to render progress.
    - If the progress string is set, the job is rendered in the form
      [$timestamp:$name:$progress_string], where the progress string is
      truncated to fit the entire job on one line.
    - If the current and total progress are set, the job is rendered in the
      form [$timestamp:$name:$current/$total].
      - This could feasibly be displayed as a percentage instead, but it is
        unhelpful if the percentage goes down because the total progress has
        increased.
      - In future, it might be interesting to save space by rendering
        [$timestamp:$name], with progress indicated by the background colour.
    - If only the current progress is set, the job is rendered in the form
      [$timestamp:$name:$current].
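
To make the three rendering forms concrete, here is a rough sketch of how a StatusJob might pick between them. The function name `render_job`, its parameters and the `max_width` default are all made up for illustration and are not existing BuildStream code; the order of the checks simply follows the list above.

```python
from typing import Optional


def render_job(timestamp: str, name: str,
               current: Optional[int] = None,
               total: Optional[int] = None,
               progress_string: Optional[str] = None,
               max_width: int = 80) -> str:
    """Hypothetical helper: render one job line in one of the proposed forms."""
    if progress_string is not None:
        prefix = "[{}:{}:".format(timestamp, name)
        # Leave one column for the closing bracket, then truncate the
        # progress string so the whole job fits on one line.
        room = max(max_width - len(prefix) - 1, 0)
        return prefix + progress_string[:room] + "]"
    if current is not None and total is not None:
        return "[{}:{}:{}/{}]".format(timestamp, name, current, total)
    if current is not None:
        return "[{}:{}:{}]".format(timestamp, name, current)
    # No progress information at all: fall back to the plain form.
    return "[{}:{}]".format(timestamp, name)
```

Whether a progress string should take priority over numbers when both are set is an assumption here; it is one of the details that would need deciding.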

For activities in the main process:

* Add a new contextmanager, `simple_task`, that yields a Task (as defined in
  `_state.py`), which has methods to:
  - get/set the current progress (a number)
  - get/set the total progress (a number)
  - set a progress string
* This contextmanager extends the `timed_activity` contextmanager (in
  `_messenger.py`), but also creates a Task before yielding, and removes the
  Task after yielding.
* In addition, the contextmanager starts a 1-second ticker event loop, so that
  if the activity lasts
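
As a rough illustration of the shape this could take, here is a minimal sketch. The `Task` class below is a stand-in written for this example, not the real one from `_state.py`, and the `tasks` dict and `on_changed` parameter are assumptions. The timing behaviour of `timed_activity` and the ticker are deliberately left out.

```python
from contextlib import contextmanager


class Task:
    """Hypothetical stand-in for the Task defined in _state.py."""

    def __init__(self, name, on_changed=None):
        self.name = name
        self.current = None
        self.total = None
        self.progress_string = None
        self._on_changed = on_changed

    def _changed(self):
        # Notify the subscriber (e.g. the status display) of any change.
        if self._on_changed is not None:
            self._on_changed(self)

    def set_current(self, value):
        self.current = value
        self._changed()

    def set_total(self, value):
        self.total = value
        self._changed()

    def set_progress_string(self, text):
        self.progress_string = text
        self._changed()


@contextmanager
def simple_task(tasks, name, on_changed=None):
    # Create and register the Task before yielding...
    task = Task(name, on_changed)
    tasks[name] = task
    try:
        yield task
    finally:
        # ...and remove it afterwards, even if the activity raised.
        del tasks[name]
```

A caller would then write something like `with simple_task(tasks, "Loading elements") as task:` and call `task.set_current(n)` as each element is loaded.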

For activities in subprocesses:

* Within the Scheduler, the `job_start` callback passes the Task into the Job.
* For ElementJobs, the Task is passed into the Element, too.
* Elements are extended to have methods to set the progress in this Task.
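
The hand-off described above might look roughly like the following. All of the classes here are simplified stand-ins invented for this sketch, and `set_progress` and `_set_task` are hypothetical method names. It also ignores how updates made in the subprocess would get back to the main process, which still needs working out.

```python
class Task:
    """Hypothetical stand-in for the Task defined in _state.py."""

    def __init__(self):
        self.current = None
        self.total = None


class Element:
    """Simplified stand-in for an Element."""

    def __init__(self, name):
        self.name = name
        self._task = None

    def _set_task(self, task):
        self._task = task

    def set_progress(self, current, total=None):
        # Elements may run without a Task attached; do nothing in that case.
        if self._task is None:
            return
        self._task.current = current
        self._task.total = total


class ElementJob:
    """Simplified stand-in for an ElementJob."""

    def __init__(self, element):
        self.element = element
        self.task = None

    def set_task(self, task):
        # The Job keeps the Task and passes it on to its Element.
        self.task = task
        self.element._set_task(task)


def job_start(job):
    # Stand-in for the Scheduler's job_start callback.
    task = Task()
    job.set_task(task)
    return task
```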


Unfortunately, I won't have time to implement all of this before I'm moved on
to a different project, but I hope to be able to put together an example of
this, so that I can feel out some of the details.

Appendix
--------

A non-exhaustive list of activities that might benefit from progress reporting:

* Loading elements - recursively loads .bst files and creates MetaElements.
  An activity in the main process that can only show the current number
  completed.
* Resolving elements - recursively converts MetaElements into Elements.
  An activity in the main process that can show the current and total numbers.
* Resolving cached state - calculates cache keys and checks whether every
  element is cached, if possible. An activity in the main process that can
  show current and total numbers.
* Staging sources - Extract files from the cache to a directory.
  An activity that occurs in subprocesses and could show current and total
  numbers. (I think, I haven't looked too closely into the details of how I'd
  know the current / total number of files to stage)
* Fetching - Retrieve files from a remote location.
  An activity that occurs in subprocesses. Depending on the plugin, might be
  able to do current, maximum, or just a string of the latest output.
* Tracking - Update the source ref to the current version.
  An activity that occurs in subprocesses. Depending on the plugin, it might
  take a trivial amount of time, or take long enough to need to report
  progress, which could take any form.
* Assembly - Most commonly, run commands to create an artifact.
  An activity that occurs in subprocesses. Unlikely to be able to provide a
  meaningful current number of subtasks completed, so showing the last line
  of output is probably appropriate.
* CacheSize - Calculating the amount of space taken up by the artifact cache.
  This occurs in subprocesses, and can be measured as X of Y, where Y
  increases as more files are discovered.
* Cleanup - Remove files from the artifact cache until it's below a specified
  threshold.
  This occurs in subprocesses. Outputting X of Y, where X is current cache
  size and Y is target cache size seems appropriate.

---

Thanks for reading.

Best regards,

Jonathan


--
Jonathan Maw, Software Engineer, Codethink Ltd.
Codethink privacy policy: https://www.codethink.co.uk/privacy.html

