[Notes] [Git][BuildStream/buildstream][jmac/remote_execution_client] 10 commits: _casbaseddirectory.py: add _save() function.

Jim MacArthur pushed to branch jmac/remote_execution_client at BuildStream / buildstream

Commits:

f08d3e8e

by Jim MacArthur at 2018-08-21T17:10:54Z

_casbaseddirectory.py: add _save() function.

a4e05b5a

by Jim MacArthur at 2018-08-21T17:10:54Z

sandbox.py: Allow setting the virtual directory

This is for use after remote execution has finished, since remote
execution produces a new output directory rather than modifying
the initial directory.
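
A minimal sketch of why a setter is needed, assuming a toy sandbox whose "virtual directory" is just a dict. `ToySandbox` and its `run()` behaviour are hypothetical stand-ins, not BuildStream's classes; the point is only that a caller holding the old directory must fetch it again after the run.

```python
class ToySandbox:
    """Hypothetical stand-in for a sandbox that owns a virtual directory."""

    def __init__(self, vdir):
        self._vdir = vdir

    def get_virtual_directory(self):
        return self._vdir

    def _set_virtual_directory(self, vdir):
        self._vdir = vdir

    def run(self, command):
        # A remote run hands back a new tree rather than mutating the old one,
        # so build a fresh object and swap it in.
        result = dict(self._vdir)
        result["output"] = "built by: " + " ".join(command)
        self._set_virtual_directory(result)
        return 0


sandbox = ToySandbox({"src": "main.c"})
before = sandbox.get_virtual_directory()
exit_code = sandbox.run(["make"])
after = sandbox.get_virtual_directory()  # re-fetch: 'before' is now stale
```

Here `before` still refers to the pre-run tree, which is why the caller asks the sandbox for the directory a second time.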

e46d54ab

by Jim MacArthur at 2018-08-21T17:10:54Z

Add "remote-execution" project configuration option.

This just adds one option, "remote-execution/url". Affects multiple files.

a2746372

by Jim MacArthur at 2018-08-21T17:10:54Z

doc/source/format_project.rst: Document remote-execution option

d3a80c92

by Jim MacArthur at 2018-08-21T17:10:54Z

_sandboxremote.py: New file.

a90056c6

by Jim MacArthur at 2018-08-21T17:10:54Z

_sandboxremote: Fix copyright and some style things

e4b33ac0

by Jim MacArthur at 2018-08-21T17:10:54Z

sandbox/__init__.py: Add SandboxRemote

5d095b3c

by Jim MacArthur at 2018-08-21T17:10:54Z

element.py: Get the updated virtual directory after running.

Executing run() on a sandbox can now replace the virtual directory,
since remote execution returns a potentially different directory rather
than an update to the existing one. Call get_virtual_directory() again
after running to account for this.

9ef7e8ec

by Jim MacArthur at 2018-08-21T17:10:54Z

autotools.py: Mark this as a BST_VIRTUAL_DIRECTORY plugin

1c02b30c

by Jim MacArthur at 2018-08-21T17:10:54Z

element.py: Switch to SandboxRemote if config option is set

10 changed files:

buildstream/_project.py
buildstream/buildelement.py
buildstream/data/projectconfig.yaml
buildstream/element.py
buildstream/plugins/elements/autotools.py
buildstream/sandbox/__init__.py
+ buildstream/sandbox/_sandboxremote.py
buildstream/sandbox/sandbox.py
buildstream/storage/_casbaseddirectory.py
doc/source/format_project.rst

Changes:

buildstream/_project.py

@@ -129,6 +129,7 @@ class Project():
          self.artifact_cache_specs = None
          self._sandbox = None
 +        self._remote_execution = None
          self._splits = None
          self._context.add_project(self)
@@ -460,7 +461,7 @@ class Project():
              'aliases', 'name',
              'artifacts', 'options',
              'fail-on-overlap', 'shell', 'fatal-warnings',
 -            'ref-storage', 'sandbox', 'mirrors'
 +            'ref-storage', 'sandbox', 'mirrors', 'remote-execution'
          ])
          #
@@ -478,6 +479,9 @@ class Project():
          # Load sandbox configuration
          self._sandbox = _yaml.node_get(config, Mapping, 'sandbox')
 +        # Load remote execution configuration
 +        self._remote_execution = _yaml.node_get(config, Mapping, 'remote-execution')
++
          # Load project split rules
          self._splits = _yaml.node_get(config, Mapping, 'split-rules')

buildstream/buildelement.py

@@ -155,6 +155,9 @@ class BuildElement(Element):
              command_dir = build_root
          sandbox.set_work_directory(command_dir)
 +        # Tell sandbox which directory is preserved in the finished artifact
 +        sandbox.set_output_directory(install_root)
++
          # Setup environment
          sandbox.set_environment(self.get_environment())

buildstream/data/projectconfig.yaml

@@ -204,3 +204,6 @@ shell:
    # Command to run when `bst shell` does not provide a command
    #
    command: [ 'sh', '-i' ]
++
 +remote-execution:
 +  url: ""
 \ No newline at end of file
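
A minimal sketch of how an empty default `url` can act as "remote execution disabled", assuming plain dicts in place of BuildStream's YAML nodes. `DEFAULTS` and `remote_execution_url` are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for the defaults shipped in projectconfig.yaml.
DEFAULTS = {"remote-execution": {"url": ""}}


def remote_execution_url(project_config):
    """Return the configured URL, or '' when the project does not set one."""
    merged = dict(DEFAULTS["remote-execution"])
    merged.update(project_config.get("remote-execution", {}))
    return merged["url"]


unset = remote_execution_url({})
configured = remote_execution_url(
    {"remote-execution": {"url": "buildserver.example.com:50051"}}
)
```

Because the empty string is falsy in Python, a caller can test the returned value directly to decide whether remote execution was requested.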

buildstream/element.py

@@ -95,6 +95,7 @@ from . import _site
  from ._platform import Platform
  from .plugin import CoreWarnings
  from .sandbox._config import SandboxConfig
 +from .sandbox._sandboxremote import SandboxRemote
  from .storage.directory import Directory
  from .storage._filebaseddirectory import FileBasedDirectory
@@ -250,6 +251,9 @@ class Element(Plugin):
          # Extract Sandbox config
          self.__sandbox_config = self.__extract_sandbox_config(meta)
 +        # Extract remote execution URL
 +        self.__remote_execution_url = self.__extract_remote_execution_config(meta)
++
      def __lt__(self, other):
          return self.name < other.name
@@ -1545,6 +1549,8 @@ class Element(Plugin):
                  finally:
                      if collect is not None:
                          try:
 +                            # Sandbox will probably have replaced its virtual directory, so get it again
 +                            sandbox_vroot = sandbox.get_virtual_directory()
                              collectvdir = sandbox_vroot.descend(collect.lstrip(os.sep).split(os.sep))
                          except VirtualDirectoryError:
                              # No collect directory existed
@@ -2117,7 +2123,24 @@ class Element(Plugin):
          project = self._get_project()
          platform = Platform.get_platform()
 -        if directory is not None and os.path.exists(directory):
 +        if self.__remote_execution_url and self.BST_VIRTUAL_DIRECTORY:
 +            if not self.__artifacts.has_push_remotes(element=self):
 +                # Give an early warning if remote execution will not work
 +                raise ElementError("Artifact {} is configured to use remote execution but has no push remotes. "
 +                                   .format(self.name) +
 +                                   "The remote artifact server(s) may not be correctly configured or contactable.")
++
 +            self.info("Using a remote 'sandbox' for artifact {}".format(self.name))
 +            sandbox = SandboxRemote(context, project,
 +                                    directory,
 +                                    stdout=stdout,
 +                                    stderr=stderr,
 +                                    config=config,
 +                                    server_url=self.__remote_execution_url,
 +                                    allow_real_directory=False)
 +            yield sandbox
 +        elif directory is not None and os.path.exists(directory):
 +            self.info("Using a local sandbox for artifact {}".format(self.name))
              sandbox = platform.create_sandbox(context, project,
                                                directory,
                                                stdout=stdout,
@@ -2289,6 +2312,18 @@ class Element(Plugin):
          return SandboxConfig(self.node_get_member(sandbox_config, int, 'build-uid'),
                               self.node_get_member(sandbox_config, int, 'build-gid'))
 +    def __extract_remote_execution_config(self, meta):
 +        if self.__is_junction:
 +            return ''
 +        else:
 +            project = self._get_project()
 +            project.ensure_fully_loaded()
 +            if project._remote_execution:
 +                rexec_config = _yaml.node_chain_copy(project._remote_execution)
 +                return self.node_get_member(rexec_config, str, 'url')
 +            else:
 +                return ''
++
      # This makes a special exception for the split rules, which
      # elements may extend but whose defaults are defined in the project.
      #
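
The branch order added to element.py can be sketched as a small pure function. This is an illustration under stated assumptions: `choose_sandbox` and its three arguments are hypothetical, and `ElementError` here is a local stand-in rather than BuildStream's own exception.

```python
class ElementError(Exception):
    """Local stand-in for BuildStream's ElementError."""


def choose_sandbox(remote_url, supports_virtual_directory, has_push_remotes):
    """Return 'remote' or 'local', following the order of checks in the diff."""
    if remote_url and supports_virtual_directory:
        if not has_push_remotes:
            # Fail early: sources cannot reach the server without a push remote.
            raise ElementError("remote execution configured but no push remotes")
        return "remote"
    # No URL configured, or the plugin has not opted in to virtual directories.
    return "local"
```

Note that a configured URL alone is not enough: an element whose plugin does not set `BST_VIRTUAL_DIRECTORY` still gets a local sandbox.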

buildstream/plugins/elements/autotools.py

@@ -57,7 +57,7 @@ from buildstream import BuildElement
  # Element implementation for the 'autotools' kind.
  class AutotoolsElement(BuildElement):
 -    pass
 +    BST_VIRTUAL_DIRECTORY = True
  # Plugin entry point

buildstream/sandbox/__init__.py

@@ -20,3 +20,4 @@
  from .sandbox import Sandbox, SandboxFlags
  from ._sandboxchroot import SandboxChroot
  from ._sandboxbwrap import SandboxBwrap
 +from ._sandboxremote import SandboxRemote

buildstream/sandbox/_sandboxremote.py

 +#!/usr/bin/env python3
 +#
 +#  Copyright (C) 2018 Bloomberg LP
 +#
 +#  This program is free software; you can redistribute it and/or
 +#  modify it under the terms of the GNU Lesser General Public
 +#  License as published by the Free Software Foundation; either
 +#  version 2 of the License, or (at your option) any later version.
 +#
 +#  This library is distributed in the hope that it will be useful,
 +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
 +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
 +#  Lesser General Public License for more details.
 +#
 +#  You should have received a copy of the GNU Lesser General Public
 +#  License along with this library. If not, see <http://www.gnu.org/licenses/>.
 +#
 +#  Authors:
 +#        Jim MacArthur <jim macarthur codethink co uk>
++
 +import os
 +import re
++
 +import grpc
++
 +from . import Sandbox
 +from ..storage._filebaseddirectory import FileBasedDirectory
 +from ..storage._casbaseddirectory import CasBasedDirectory
 +from .._protos.build.bazel.remote.execution.v2 import remote_execution_pb2, remote_execution_pb2_grpc
 +from .._artifactcache.cascache import CASCache
++
++
 +class SandboxError(Exception):
 +    pass
++
++
 +# SandboxRemote()
 +#
 +# This isn't really a sandbox, it's a stub which sends all the source
 +# to a remote server and retrieves the results from it.
 +#
 +class SandboxRemote(Sandbox):
++
 +    def __init__(self, *args, **kwargs):
 +        super().__init__(*args, **kwargs)
 +        self.cascache = None
 +        self.server_url = kwargs['server_url']
 +        # Check the format of the url ourselves to save the user from
 +        # whatever error messages grpc will produce
 +        m = re.match(r'^(.+):(\d+)$', self.server_url)
 +        if m is None:
 +            raise SandboxError("Configured remote URL '{}' does not match the expected layout. "
 +                               .format(self.server_url) +
 +                               "It should be of the form <protocol>://<domain name>:<port>.")
++
 +    def _get_cascache(self):
 +        if self.cascache is None:
 +            self.cascache = CASCache(self._get_context())
 +            self.cascache.setup_remotes(use_config=True)
 +        return self.cascache
++
 +    def __run_remote_command(self, cascache, command, input_root_digest, environment):
++
 +        environment_variables = [remote_execution_pb2.Command.
 +                                 EnvironmentVariable(name=k, value=v)
 +                                 for (k, v) in environment.items()]
++
 +        # Create and send the Command object.
 +        remote_command = remote_execution_pb2.Command(arguments=command, environment_variables=environment_variables,
 +                                                      output_files=[],
 +                                                      output_directories=[self._output_directory],
 +                                                      platform=None)
 +        command_digest = cascache.add_object(buffer=remote_command.SerializeToString())
 +        command_ref = 'worker-command/{}'.format(command_digest.hash)
 +        cascache.set_ref(command_ref, command_digest)
++
 +        command_push_successful = cascache.push_refs([command_ref], self._get_project(), may_have_dependencies=False)
 +        if not command_push_successful and not cascache.verify_key_pushed(command_ref, self._get_project()):
 +            # Command push failed
 +            return None
++
 +        # Create and send the action.
++
 +        action = remote_execution_pb2.Action(command_digest=command_digest,
 +                                             input_root_digest=input_root_digest,
 +                                             timeout=None,
 +                                             do_not_cache=True)
++
 +        action_digest = cascache.add_object(buffer=action.SerializeToString())
 +        action_ref = 'worker-action/{}'.format(command_digest.hash)
 +        cascache.set_ref(action_ref, action_digest)
 +        action_push_successful = cascache.push_refs([action_ref], self._get_project(), may_have_dependencies=False)
++
 +        if not action_push_successful and not cascache.verify_key_pushed(action_ref, self._get_project()):
 +            # Action push failed
 +            return None
++
 +        # Next, try to create a communication channel to the BuildGrid server.
++
 +        channel = grpc.insecure_channel(self.server_url)
 +        stub = remote_execution_pb2_grpc.ExecutionStub(channel)
 +        request = remote_execution_pb2.ExecuteRequest(instance_name='default',
 +                                                      action_digest=action_digest,
 +                                                      skip_cache_lookup=True)
++
 +        operation_iterator = stub.Execute(request)
 +        operation = None
 +        with self._get_context().timed_activity("Waiting for the remote build to complete"):
 +            # It is advantageous to check operation_iterator.code() is grpc.StatusCode.OK here,
 +            # which will check the server is actually contactable. However, calling it when the
 +            # server is available seems to cause .code() to hang forever.
 +            for operation in operation_iterator:
 +                if operation.done:
 +                    break
 +        return operation
++
 +    def process_job_output(self, output_directories, output_files):
 +        # output_directories is an array of OutputDirectory objects.
 +        # output_files is an array of OutputFile objects.
 +        #
 +        # We only specify one output_directory, so it's an error
 +        # for there to be any output files or more than one directory at the moment.
++
 +        if output_files:
 +            raise SandboxError("Output files were returned when we didn't request any.")
 +        elif len(output_directories) > 1:
 +            error_text = "More than one output directory was returned from the build server: {}"
 +            raise SandboxError(error_text.format(output_directories))
 +        elif len(output_directories) < 1:  # pylint: disable=len-as-condition
 +            error_text = "No output directory was returned from the build server."
 +            raise SandboxError(error_text)
++
 +        digest = output_directories[0].tree_digest
 +        if digest is None or digest.hash is None or digest.hash == "":
 +            raise SandboxError("Output directory structure had no digest attached.")
++
 +        # Now do a pull to ensure we have the necessary parts.
 +        cascache = self._get_cascache()
 +        cascache.pull_key(digest.hash, digest.size_bytes, self._get_project())
 +        path_components = os.path.split(self._output_directory)
++
 +        # Now what we have is a digest for the output. Once we return, the calling process will
 +        # attempt to descend into our directory and find that directory, so we need to overwrite
 +        # that.
++
 +        if not path_components:
 +            # The artifact wants the whole directory; we could just return the returned hash in its
 +            # place, but we don't have a means to do that yet.
 +            raise SandboxError("Unimplemented: Output directory is empty or equal to the sandbox root.")
++
 +        # At the moment, we will get the whole directory back in the first directory argument and we need
 +        # to replace the sandbox's virtual directory with that. Creating a new virtual directory object
 +        # from another hash will be interesting, though...
++
 +        new_dir = CasBasedDirectory(self._get_context(), ref=digest)
 +        self._set_virtual_directory(new_dir)
++
 +    def run(self, command, flags, *, cwd=None, env=None):
 +        # Upload sources
 +        upload_vdir = self.get_virtual_directory()
++
 +        if isinstance(upload_vdir, FileBasedDirectory):
 +            # Make a new temporary directory to put source in
 +            upload_vdir = CasBasedDirectory(self._get_context(), ref=None)
 +            upload_vdir.import_files(self.get_virtual_directory()._get_underlying_directory())
++
 +        # Now, push that key (without necessarily needing a ref) to the remote.
 +        cascache = self._get_cascache()
++
 +        ref = 'worker-source/{}'.format(upload_vdir.ref.hash)
 +        upload_vdir._save(ref)
 +        source_push_successful = cascache.push_refs([ref], self._get_project())
++
 +        # Set up environment and PWD
 +        if env is None:
 +            env = self._get_environment()
 +        if 'PWD' not in env:
 +            env['PWD'] = self._get_work_directory()
++
 +        # We want command args as a list of strings
 +        if isinstance(command, str):
 +            command = [command]
++
 +        # Now transmit the command to execute
 +        if source_push_successful or cascache.verify_key_pushed(ref, self._get_project()):
 +            response = self.__run_remote_command(cascache, command, upload_vdir.ref, env)
++
 +            if response is None:
 +                # Failure of remote execution, usually due to an error in BuildStream
 +                # NB This error could be raised in __run_remote_command
 +                raise SandboxError("No response returned from server")
++
 +            assert(response.HasField("error") or response.HasField("response"))
++
 +            if response.HasField("error"):
 +                # A normal error during the build: the remote execution system
 +                # has worked correctly but the command failed.
 +                # response.error also contains 'message' (str) and 'details'
 +                # (iterator of Any) which we ignore at the moment.
 +                return response.error.code
 +            else:
++
 +                # At the moment, response can either be an
 +                # ExecutionResponse containing an ActionResult, or an
 +                # ActionResult directly.
 +                executeResponse = remote_execution_pb2.ExecuteResponse()
 +                if response.response.Is(executeResponse.DESCRIPTOR):
 +                    # Unpack ExecuteResponse and set response to its response
 +                    response.response.Unpack(executeResponse)
 +                    response = executeResponse
++
 +                actionResult = remote_execution_pb2.ActionResult()
 +                if response.response.Is(actionResult.DESCRIPTOR):
 +                    response.response.Unpack(actionResult)
 +                    self.process_job_output(actionResult.output_directories, actionResult.output_files)
 +                else:
 +                    raise SandboxError("Received unknown message from server (expected ExecutionResponse).")
 +        else:
 +            raise SandboxError("Failed to verify that source has been pushed to the remote artifact cache.")
 +        return 0
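
The checks in `process_job_output()` can be sketched without protobuf or a server. This is a minimal illustration: `Digest` and `OutputDirectory` are namedtuple stand-ins for the REAPI messages, and `OutputError` and `single_output_digest` are hypothetical names.

```python
from collections import namedtuple

# Hypothetical stand-ins for the REAPI Digest and OutputDirectory messages.
Digest = namedtuple("Digest", ["hash", "size_bytes"])
OutputDirectory = namedtuple("OutputDirectory", ["path", "tree_digest"])


class OutputError(Exception):
    """Raised when the server's result does not have the expected shape."""


def single_output_digest(output_directories, output_files):
    """Return the digest of the only output directory, or raise OutputError."""
    if output_files:
        raise OutputError("output files were returned but none were requested")
    if len(output_directories) != 1:
        raise OutputError(
            "expected exactly one output directory, got {}".format(len(output_directories))
        )
    digest = output_directories[0].tree_digest
    if digest is None or not digest.hash:
        raise OutputError("output directory has no digest attached")
    return digest
```

Collapsing the "more than one" and "none" cases into a single length check is a simplification of the diff, which reports them with separate messages.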

buildstream/sandbox/sandbox.py

@@ -99,9 +99,11 @@ class Sandbox():
          self.__stdout = kwargs['stdout']
          self.__stderr = kwargs['stderr']
 -        # Setup the directories. Root should be available to subclasses, hence
 -        # being single-underscore. The others are private to this class.
 +        # Setup the directories. Root and output_directory should be
 +        # available to subclasses, hence being single-underscore. The
 +        # others are private to this class.
          self._root = os.path.join(directory, 'root')
 +        self._output_directory = None
          self.__directory = directory
          self.__scratch = os.path.join(self.__directory, 'scratch')
          for directory_ in [self._root, self.__scratch]:
@@ -144,11 +146,17 @@ class Sandbox():
                  self._vdir = FileBasedDirectory(self._root)
          return self._vdir
 +    def _set_virtual_directory(self, vdir):
 +        """ Sets virtual directory. Useful after remote execution
 +        has rewritten the working directory.
 +        """
 +        self._vdir = vdir
++
      def set_environment(self, environment):
          """Sets the environment variables for the sandbox
          Args:
 -           directory (dict): The environment variables to use in the sandbox
 +           environment (dict): The environment variables to use in the sandbox
          """
          self.__env = environment
@@ -160,6 +168,15 @@ class Sandbox():
          """
          self.__cwd = directory
 +    def set_output_directory(self, directory):
 +        """Sets the output directory - the directory which is preserved
 +        as an artifact after assembly.
++
 +        Args:
 +           directory (str): An absolute path within the sandbox
 +        """
 +        self._output_directory = directory
++
      def mark_directory(self, directory, *, artifact=False):
          """Marks a sandbox directory and ensures it will exist

buildstream/storage/_casbaseddirectory.py

@@ -561,3 +561,20 @@ class CasBasedDirectory(Directory):
          throw an exception. """
          raise VirtualDirectoryError("_get_underlying_directory was called on a CAS-backed directory," +
                                      " which has no underlying directory.")
++
 +    def _save(self, name):
 +        """Saves this directory into the content cache as a named ref. Used
 +        by remote execution to make references for source directories so they
 +        can be pushed to a remote artifact server.
++
 +        """
 +        self._recalculate_recursing_up()
 +        self._recalculate_recursing_down()
 +        (rel_refpath, refname) = os.path.split(name)
 +        refdir = os.path.join(self.cas_directory, 'refs', 'heads', rel_refpath)
 +        refname = os.path.join(refdir, refname)
++
 +        if not os.path.exists(refdir):
 +            os.makedirs(refdir)
 +        with open(refname, "wb") as f:
 +            f.write(self.ref.SerializeToString())
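
The ref-writing half of `_save()` can be sketched in isolation, assuming a temporary directory in place of the CAS directory and raw bytes in place of the serialized digest. `save_ref` is a hypothetical helper; the `refs/heads` layout is the one shown in the diff.

```python
import os
import tempfile


def save_ref(cas_directory, name, payload):
    """Write payload to <cas_directory>/refs/heads/<name>, creating parents."""
    rel_refpath, refname = os.path.split(name)
    refdir = os.path.join(cas_directory, "refs", "heads", rel_refpath)
    # exist_ok avoids the separate os.path.exists() check used in the diff.
    os.makedirs(refdir, exist_ok=True)
    refpath = os.path.join(refdir, refname)
    with open(refpath, "wb") as f:
        f.write(payload)
    return refpath


with tempfile.TemporaryDirectory() as cas:
    path = save_ref(cas, "worker-source/0123abcd", b"serialized digest")
    relative = os.path.relpath(path, cas)
```

A ref name containing a slash, such as the `worker-source/<hash>` names used by `SandboxRemote.run()`, ends up as a subdirectory under `refs/heads`.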

doc/source/format_project.rst

@@ -204,6 +204,23 @@ with an artifact share.
  You can also specify a list of caches here; earlier entries in the list
  will have higher priority than later ones.
 +Remote execution
 +~~~~~~~~~~~~~~~~
 +BuildStream supports remote execution using the Google Remote Execution API
 +(REAPI). A description of how remote execution works is beyond the scope
 +of this document, but you can specify a remote server complying with the REAPI
 +using the `remote-execution` option:
++
 +.. code:: yaml
++
 +  remote-execution:
++
 +    # A url defining a remote execution server
 +    url: buildserver.example.com:50051
++
 +The url should be a hostname and port separated by ':'. Do not include a protocol.
++
 +The Remote Execution API can be found via https://github.com/bazelbuild/remote-apis.
  .. _project_essentials_mirrors:
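
A minimal sketch of a check for the documented `hostname:port` form. The pattern in `_sandboxremote.py`, `^(.+):(\d+)$`, also accepts a string that carries a protocol, because `.+` matches any characters. The stricter pattern below is an assumption, not the project's code; `parse_host_port` is a hypothetical helper, and IPv6 literals are not handled.

```python
import re

# Assumed pattern: a host containing no '/', ':' or whitespace, then ':<port>'.
HOST_PORT = re.compile(r"^([^/:\s]+):(\d+)$")


def parse_host_port(url):
    """Split 'host:port' into (host, port); raise ValueError otherwise."""
    m = HOST_PORT.match(url)
    if m is None:
        raise ValueError(
            "'{}' should be of the form <hostname>:<port>, with no protocol".format(url)
        )
    return m.group(1), int(m.group(2))


host, port = parse_host_port("buildserver.example.com:50051")
```

With this pattern, `http://buildserver.example.com:50051` is rejected, which matches the documentation's "Do not include a protocol."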
