Re: [BuildStream] Proposal: A small number of subprocesses handling jobs



Hi,

On Mon, 2019-03-04 at 15:50 +0000, Jonathan Maw via buildstream-list wrote:
[...]
We discussed the pool of subprocesses, and arrived at broadly two 
different ways to go about it:

1. A multiprocessing pool that forks off at the start of the scheduler.
===

* Changes to the element during the lifetime of a job will be captured 
and passed through the job's result object when the job finishes.
* Changes from a job result will be received by the scheduler and pushed 
to each worker subprocess.
* Mandate that a job may change only the element it is running for.
   - A "soft" mandate (changes to other elements are simply not 
propagated to the other workers) is enough for normal operation, but a 
separate mode where such changes are forbidden (or where any changes 
outside the element are thrown away) would be useful for debugging.
* A "pristine" subprocess that is unchanged by jobs would be useful for 
forking new subprocesses, especially if we decide that each worker 
should have a finite lifetime.
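To make the above concrete, here is a minimal sketch of option 1, assuming a persistent worker pool where each job returns the element state changes it made and the scheduler then pushes that delta back out to the workers. All the names here (`_run_job`, `_apply_changes`, `schedule`, the "cached" state) are illustrative, not BuildStream API:

```python
import multiprocessing

_local_state = {}  # each worker process holds its own copy of element state


def _apply_changes(changes):
    # Runs in a worker: merge a state delta broadcast by the scheduler.
    _local_state.update(changes)


def _run_job(element_name):
    # Runs in a worker: mutate only the element this job is for, and
    # return the delta so the scheduler can propagate it.
    changes = {element_name: "cached"}
    _local_state.update(changes)
    return changes


def schedule(element_names, pool_size=2):
    ctx = multiprocessing.get_context("fork")
    results = {}
    with ctx.Pool(pool_size) as pool:
        for changes in pool.imap_unordered(_run_job, element_names):
            results.update(changes)
            # Push the delta back out to the workers.  (A real scheduler
            # would need to guarantee each worker applies it exactly once;
            # Pool.map gives no such guarantee, so this is only a sketch.)
            pool.map(_apply_changes, [changes] * pool_size)
    return results
```

The broadcast step is where the overhead discussed below comes from: every job result must cross a process boundary twice.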

2. An element graph service
===

I.e. one subprocess holds the pipeline, and some form of IPC is used to 
get and set state changes.

This is a valuable long-term goal to work towards, once we have a much 
better idea of where/when we access the element graph, and have 
completely encapsulated every place a plugin would access the element 
graph.
At that point, the service process will be acutely affected by any 
slowness in `_update_state()`, and if plugin authors can influence that 
code path then we will have to hope for (or demand) implementations 
with a small time impact.
(As an aside, this is currently not the case: git-based plugins 
implement `validate_cache()` and fork off a git subprocess to find the 
branch and tag. Using libgit2 here would be valuable.)
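A minimal sketch of what option 2 could look like, assuming one process owns the element graph and answers get/set requests over a pipe. The wire protocol and all names here are illustrative, not BuildStream API:

```python
import multiprocessing


def _graph_service(conn):
    graph = {}  # element name -> state, owned only by this process
    while True:
        request = conn.recv()
        if request[0] == "set":
            _, name, state = request
            graph[name] = state
        elif request[0] == "get":
            conn.send(graph.get(request[1]))
        else:  # "stop"
            conn.close()
            break


def start_graph_service():
    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=_graph_service, args=(child_conn,))
    proc.start()
    return parent_conn, proc
```

Note that every "get" is a round-trip through the pipe, which is why any slowness in state updates on the service side would be felt acutely by all clients.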

Both of these drastic changes incur some measure of overhead.

Where is the evidence that these overheads are going to be less than
the overhead of simply forking on demand?

Without this evidence, we should definitely simply stay with option 3,
which is to not make any changes: the code is simpler now than it would
be if either of the proposed routes above were followed.

While this means that the core must be careful to tip-toe around
libraries which inadvertently spawn threads, this tip-toeing is not
necessary in the plugin code[1]. I strongly believe fork-on-demand is
the lesser evil in terms of overall complexity in the core.
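For comparison with the sketches above, fork-on-demand amounts to forking a fresh subprocess per job: the child inherits the element state as it was at fork time, and nothing needs to be propagated between long-lived workers. Again, the names here are illustrative, not BuildStream API:

```python
import multiprocessing


def _job(element_name, conn):
    # Work happens in the short-lived child; only the result crosses
    # the pipe back to the scheduler.
    conn.send((element_name, "ok"))
    conn.close()


def run_job(element_name):
    ctx = multiprocessing.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=_job, args=(element_name, child_conn))
    proc.start()
    result = parent_conn.recv()
    proc.join()
    return result
```

The per-job fork cost is the overhead being weighed against the two proposals above.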

Cheers,
   -Tristan

[1]: Plugin code here refers to code controlled by plugin authors, not
private core business logic encoded into the base classes.

And the statement above is actually inaccurate, but it would not be
very difficult to ensure that plugin code is only ever invoked from a
subprocess, ensuring that plugins need not worry about thread-spawning
libraries in their own code.

Further, the majority of calls into plugin code from the main process
are to `Source.get_consistency()` (to interrogate the Source-plugin-owned
cache directory initially) - we should be able to significantly
optimize load times by parallelizing these interrogations.
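A sketch of that parallelization, assuming each Source's cache can be checked independently in a subprocess. `check_source_cache()` is a hypothetical stand-in for the work behind `Source.get_consistency()`, which in reality inspects a Source-owned cache directory on disk:

```python
import multiprocessing


def check_source_cache(source_name):
    # Placeholder: a real implementation would inspect the cache on disk
    # and return the actual consistency state for this source.
    return source_name, "CACHED"


def interrogate_sources(source_names, workers=4):
    # Fan the per-source checks out over a pool instead of running them
    # serially in the main process at load time.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(workers) as pool:
        return dict(pool.map(check_source_cache, source_names))
```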

Cheers,
    -Tristan


