Re: Proposal for Remote Execution



Hi,

On Wed, Apr 11, 2018 at 10:37 PM Jürg Billeter <j bitron ch> wrote:
[...]
 
CAS FUSE Layer
~~~~~~~~~~~~~~
As a third phase I'm proposing to introduce a new FUSE layer that allows
safe access to a directory tree stored in CAS without having to extract
it to a regular file system. This is expected to reduce staging time for
local execution. For remote execution it matters even more because:
* Workers and CAS may run on different machines, which means that staging
  will be a lot more expensive than our current approach with hard links.
* With build jobs that require execution of multiple actions, the already
  expensive staging would have to be done multiple times.

Added and modified files will initially be stored in a local temporary
directory and eventually stored in CAS.
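
To illustrate the idea of serving reads from CAS while keeping added and
modified files in a local temporary directory, here is a minimal sketch in
Python. It is a toy model only, not the proposed C FUSE layer: the names
ToyCAS and OverlayTree, the in-memory blob store, and the use of SHA-256 as
the digest are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


class ToyCAS:
    """Hypothetical in-memory content-addressable store (SHA-256 keys)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The digest of the content is the key, so identical data is stored once.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


class OverlayTree:
    """Hypothetical view of a CAS tree (path -> digest) with local writes."""

    def __init__(self, cas: ToyCAS, tree: dict):
        self._cas = cas
        self._tree = dict(tree)
        self._tmp = tempfile.TemporaryDirectory()
        self._dirty = {}  # path -> backing file in the temporary directory

    def read(self, path: str) -> bytes:
        # Added or modified files shadow the CAS-backed content.
        if path in self._dirty:
            with open(self._dirty[path], "rb") as f:
                return f.read()
        return self._cas.get(self._tree[path])

    def write(self, path: str, data: bytes) -> None:
        # Writes never touch CAS directly; they go to the temporary directory.
        local = self._dirty.get(path)
        if local is None:
            local = os.path.join(self._tmp.name, f"{len(self._dirty)}.blob")
            self._dirty[path] = local
        with open(local, "wb") as f:
            f.write(data)

    def commit(self) -> dict:
        # Eventually the local files are stored in CAS and the tree is updated.
        for path, local in self._dirty.items():
            with open(local, "rb") as f:
                self._tree[path] = self._cas.put(f.read())
        self._dirty.clear()
        return dict(self._tree)
```

In this sketch the original blob stays untouched in the store after a write,
which is the property that makes the access "safe": a modified file only
produces a new digest at commit time.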

We saw a huge performance overhead in our initial FUSE tests with pyfuse and
bindfs. However, in the meantime I wrote a prototype FUSE layer in C that
makes use of various optimizations that FUSE supports, including zero-copy
reads/writes and a reduced number of context switches. Initial tests with
this FUSE layer show a reduction of the overhead from a factor of 2-5 to
just 6% in a parallel workload. With the reduced staging time, the overall
time for a build job is expected to be close to the current implementation.

I was actually expecting that the reduced staging time would make the overall build job faster. Obviously, that depends on how much there is to stage.
 
This phase will change the new Sandbox.get_virtual_directory() function to
return a CAS-backed Directory object and the Sandbox implementation will
mount the FUSE layer.

The details of how to handle the dependency on the C-based FUSE layer are
open for discussion. It might become an optional dependency. If we can make
the new CAS FUSE layer a hard dependency, we can completely drop the
existing pyfuse SafeHardlinks layer.

I am assuming that, whether or not it becomes a hard dependency, we will not use the SafeHardlinks layer when the C-based FUSE layer is in use?
 
The new FUSE layer will also make it possible to track used files for #56;
however, this is not part of this proposal.

I understand that we're keeping this out of scope for this proposal.

Cheers,

Sander

