[BuildStream] Frontend/Backend Process Split - Status Update



Hi all,

For quite a while now I, and before me Phil Dawson, have been
working towards introducing multiprocessing to the BuildStream 'main'
process. Essentially the aim, as originally discussed here [1], was to
have every entry point into Stream (say build, show, pull etc.) be
handed off to a separate process, allowing the 'backend' to spend more
time actually doing work whilst the 'frontend' handled the UI rendering
in the parent. The main push for this came from the WSL1 use case, where
it could be seen that printing each status update to the terminal was
taking up an excessive amount of the bst process's CPU time.

To get to a point where it was possible to implement a PoC for this, a
number of MRs [2] were landed to detangle some of the interlaced
UI-backend logic, primarily where one end required state held by
objects on the other (such as a message in the frontend calling into the
backend to load the plugin table just to get the name of an element).
There was also the addition of a 'Notification' queue between Stream
& Scheduler, acting as the main communication boundary between the
conceptual 'ends'. With the abstractions in place the aim was to land
the actual multiprocessing machinery standalone, allowing progression
whilst resulting in a more digestible & focused MR. Since starting down
this path, a number of parallel changes have landed in the main codebase
that somewhat add to the complexity of having two processes represent
the main process, including the introduction of the State class & having
to handle running buildbox-casd as a child process. It has also required
a lot of manual interactive testing and UI output comparison, something
that's not currently covered by our automated testing (although that
should hopefully change soon!).
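
As a rough illustration of the boundary described above, here is a
minimal Python sketch of a backend process reporting to a frontend over
a queue. The names `backend` and `frontend`, the element names and the
tuple-shaped notifications are hypothetical stand-ins, not the actual
Stream/Scheduler API, and it assumes a platform where the 'fork' start
method is available.

```python
import multiprocessing as mp


def backend(notify):
    """Stand-in for the scheduler side: does the 'work' and reports progress."""
    for element in ("base.bst", "app.bst"):  # hypothetical element names
        notify.put(("status", f"built {element}"))
    notify.put(("done", ""))  # sentinel: no more notifications will follow


def frontend(notify, render=print):
    """Stand-in for the UI side: drains the queue until the backend is done."""
    rendered = []
    while True:
        kind, payload = notify.get()
        if kind == "done":
            return rendered
        render(payload)
        rendered.append(payload)


def main():
    ctx = mp.get_context("fork")  # assumption: Unix-like host with fork
    notify = ctx.Queue()
    proc = ctx.Process(target=backend, args=(notify,))
    proc.start()
    frontend(notify)  # the parent only renders; the child does the work
    proc.join()


if __name__ == "__main__":
    main()
```

The point of the sketch is only that the queue is the single channel
between the two ends: the frontend never calls into backend objects.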

As it stands [3], a PoC/WIP branch for running bst build under
multiprocessing is at a stage where it's mostly functional in comparison
to master. bst build was chosen as the target entry point, but the
implementation is adaptable. Due to this there is some added complexity
where certain logic is defined by whether or not there is
multiprocessing in the given context. Functionally, the build use case
is operational enough to produce end-to-end benchmarks with complete UI
output, and at this point it makes sense to determine if progressing
the work is beneficial, especially considering the added code paths
(adding extra multiprocessing & async overheads, subprocess exception
handling etc. whilst providing no benefit to non-WSL targets). Signal
handling, including the user interactive interrupt, along with shelling
into a failed build are 'functional' on Linux too. It's also worth
noting that this PoC is implemented using the 'fork' start method, with
the effort needed to switch to spawn summarised here [4]. As it stands
that makes the PoC incompatible with the push towards Win32 support,
along with any future targets that only support spawn (Python 3.8
recently changed the default start method on macOS to spawn).
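
To make the fork/spawn distinction above concrete: under fork the child
inherits the parent's memory, whereas under spawn the process target and
its arguments have to be pickled and sent to a fresh interpreter. The
sketch below is a generic illustration using the standard library only;
the helper names `is_spawn_safe` and `pick_context` are hypothetical and
not part of the PoC.

```python
import multiprocessing as mp
import pickle


def is_spawn_safe(obj):
    """Return True if obj can be pickled, as spawn requires of targets and args."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def pick_context(preferred="fork"):
    """Return a context for the preferred start method, else fall back to spawn."""
    if preferred in mp.get_all_start_methods():
        return mp.get_context(preferred)
    return mp.get_context("spawn")  # spawn is available on every platform


# Plain data crosses a spawn boundary; a lambda (or other unpicklable
# in-memory state) does not, even though fork would simply inherit it.
print(is_spawn_safe(("element.bst", 3)))
print(is_spawn_safe(lambda: None))
```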

Looking at the specific target of WSL1, there are some encouraging time
improvements on some systems (*30-40%* quicker build times for the
synthetic Debian stack); however, on other systems there's little to no
difference due to external environmental factors. With this in mind, and
the upcoming release of WSL2 which should perform much more in line with
native Linux, this effort will be paused in favour of making
optimisations that should not exclusively benefit the WSL1 target.
Changing how & when we display updates to the UI should reduce terminal
overheads, whilst also making the output more digestible & useful to the
user on tasks that execute rapidly.
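
One possible shape for "how & when we display updates" is to coalesce
rapid status updates and render at most once per interval. The sketch
below is an assumption about what such throttling could look like, not a
description of any planned change; the class name `ThrottledStatus` and
its parameters are hypothetical.

```python
import time


class ThrottledStatus:
    """Render at most one status message per min_interval seconds."""

    def __init__(self, render, min_interval=0.1, clock=time.monotonic):
        self._render = render
        self._min_interval = min_interval
        self._clock = clock
        self._last = None      # time of the last render, None before the first
        self._pending = None   # newest message that was held back, if any

    def update(self, message):
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval:
            self._render(message)
            self._last = now
            self._pending = None
        else:
            # Too soon since the last render: keep only the latest message.
            self._pending = message

    def flush(self):
        """Render any held-back message, e.g. when a task finishes."""
        if self._pending is not None:
            self._render(self._pending)
            self._last = self._clock()
            self._pending = None
```

A caller would route every status line through `update()` and call
`flush()` at the end of a task, so the final state is always shown even
when intermediate updates were dropped.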

For now I'll look to split out some small changes in the MR where it
makes sense (including [5] and the current behavior, and flipping the
logic for `is_main_process` to `is_job_process` for clarity), with the
potential for the main body of the multiprocessing to be picked up if
it's deemed that it would be beneficial in the future. With that in mind I'd
still very much appreciate any feedback on the MR, especially any
interactive testing or user benchmarks along with comments about the
actual implementation.

Best regards,

Tom


[1] https://mail.gnome.org/archives/buildstream-list/2019-May/msg00011.html
[2] https://gitlab.com/BuildStream/buildstream/issues/1036
[3] https://gitlab.com/BuildStream/buildstream/merge_requests/1613/
[4] https://gitlab.com/BuildStream/buildstream/issues/1160
[5] https://gitlab.com/BuildStream/buildstream/merge_requests/1613/diffs?commit_id=1863ccf2d928d8bf76e696f0f22a447e41670987


