[Notes] [Git][BuildGrid/buildgrid][mablanch/83-executed-action-metadata] 24 commits: setup.py: Unpin pytest version but require >= 3.8.0



Martin Blanchard pushed to branch mablanch/83-executed-action-metadata at BuildGrid / buildgrid

Commits:

22 changed files:

Changes:

  • .gitlab-ci.yml
    ... ... @@ -43,7 +43,6 @@ tests-debian-stretch:
    43 43
       <<: *linux-tests
    
    44 44
     
    
    45 45
     run-dummy-job-debian:
    
    46
    -  image: buildstream/buildstream-debian
    
    47 46
       <<: *dummy-job
    
    48 47
     
    
    49 48
     
    

  • .pylintrc
    ... ... @@ -455,8 +455,12 @@ known-standard-library=
    455 455
     
    
    456 456
     # Force import order to recognize a module as part of a third party library.
    
    457 457
     known-third-party=boto3,
    
    458
    +                  click,
    
    458 459
                       enchant,
    
    459
    -                  grpc
    
    460
    +                  google,
    
    461
    +                  grpc,
    
    462
    +                  moto,
    
    463
    +                  yaml
    
    460 464
     
    
    461 465
     
    
    462 466
     [DESIGN]
    

  • CONTRIBUTING.rst
    1
    -
    
    2 1
     .. _contributing:
    
    3 2
     
    
    4 3
     Contributing
    
    ... ... @@ -18,9 +17,13 @@ discuss with us before submitting anything, as we may well have some important
    18 17
     context which could help guide your efforts.
    
    19 18
     
    
    20 19
     Any major feature additions should be raised first as a proposal on the
    
    21
    -`BuildGrid mailing list`_ to be discussed, and then eventually followed up with
    
    22
    -an issue on GitLab. We recommend that you propose the feature in advance of
    
    23
    -commencing work.
    
    20
    +BuildGrid mailing list to be discussed between the core contributors. Once 
    
    21
    +this discussion has taken place and there is agreement on how to proceed, 
    
    22
    +it should be followed by a GitLab issue which summarizes
    
    23
    +the tasks required.
    
    24
    +
    
    25
    +We strongly recommend that you propose the feature in advance of
    
    26
    +commencing any work.
    
    24 27
     
    
    25 28
     The author of any patch is expected to take ownership of that code and is to
    
    26 29
     support it for a reasonable time-frame. This means addressing any unforeseen
    
    ... ... @@ -204,17 +207,16 @@ Committer access
    204 207
     We'll hand out commit access to anyone who has successfully landed a single
    
    205 208
     patch to the code base. Please request this via Slack or the mailing list.
    
    206 209
     
    
    207
    -This of course relies on contributors being responsive and show willingness to
    
    208
    -address problems after landing branches there should not be any problems here.
    
    210
    +This of course relies on contributors being responsive and showing willingness 
    
    211
    +to address any problems that may arise after landing branches.
    
    209 212
     
    
    210
    -What we are expecting of committers here in general is basically to escalate the
    
    211
    -review in cases of uncertainty:
    
    213
    +When submitting a merge request, please obtain a review from another committer 
    
    214
    +who is familiar with the area of the code base which the branch affects. An
    
    215
    +approval from another committer who is not the patch author will be needed 
    
    216
    +before any merge (we use GitLab's 'approval' feature for this).
    
    212 217
     
    
    213
    -- If the change is very trivial (obvious few line changes, typos…), and you are
    
    214
    -  confident of the change, there is no need for review.
    
    215
    -- If the change is non trivial, please obtain a review from another committer
    
    216
    -  who is familiar with the area which the branch effects. An approval from
    
    217
    -  someone who is not the patch author will be needed before any merge.
    
    218
    +What we are expecting of committers here in general is basically to escalate the
    
    219
    +review in cases of uncertainty.
    
    218 220
     
    
    219 221
     .. note::
    
    220 222
     
    
    ... ... @@ -239,21 +241,49 @@ following goals:
    239 241
       for the viewer to digest.
    
    240 242
     - Ensure that we keep it simple and easy to contribute to the project.
    
    241 243
     
    
    242
    -We are currenlty using the following GitLab features:
    
    244
    +Explanation of how the project is currently using some GitLab features:
    
    243 245
     
    
    244 246
     - `Milestones`_: we have seen them used in the same way as `Epics`_ in other
    
    245
    -  projects. BuildGrid milestones must be time-line based, can overlap and we can
    
    246
    -  be working towards multiple milestones at any one time. They allow us to group
    
    247
    -  together all sub tasks into an overall aim. See our `BuildGrid milestones`_.
    
    248
    -- `Labels`_: allow us to filter tickets in useful ways. They do complexity and
    
    249
    -  effort as they grow in number and usage, though, so the general approach is
    
    250
    -  to have the minimum possible. See our `BuildGrid labels`_.
    
    247
    +  projects and are trying not to do that here. Instead we are going to 
    
    248
    +  use milestones to denote development cycles (i.e., two-week 'sprints'). See the
    
    249
    +  `BuildGrid milestones`_.
    
    250
    +- `Labels`_: allow us to filter tickets (i.e., 'issues' in GitLab terminology)
    
    251
    +  in useful ways. They add complexity and effort as they grow in number, so the
    
    252
    +  general approach is to have the minimum possible but 
    
    253
    +  ensure we use them consistently. See the `BuildGrid labels`_. 
    
    251 254
     - `Boards`_: allow us to visualise and manage issues and labels in a simple way.
    
    252
    -  For now, we are only utilising one boards. Issues start life in the
    
    253
    -  ``Backlog`` column by default, and we move them into ``ToDo`` when they are
    
    254
    -  coming up in the next few weeks. ``Doing`` is only for when an item is
    
    255
    -  currently being worked on. Moving an issue from column to column automatically
    
    256
    -  adjust the tagged labels. See our `BuildGrid boards`_.
    
    255
    +  Issues start life in the ``Backlog`` column by default, and we move them into
    
    256
    +  ``ToDo`` when we aim to complete them in the current development cycle.
    
    257
    +  ``Doing`` is only for when an item is currently being worked on. When on the
    
    258
    +  Board view, dragging and dropping an issue from column to column automatically
    
    259
    +  adjusts the relevant labels. See the `BuildGrid boards`_.
    
    260
    +  
    
    261
    +  
    
    262
    +Guidelines for using GitLab features when working on this project: 
    
    263
    +  
    
    264
    +- When raising an issue, please:
    
    265
    +   
    
    266
    +  - check to see if there already is an issue to cover this task (if not then 
    
    267
    +    raise a new one)
    
    268
    +  - assign the appropriate label or labels (tip: the vast majority of issues 
    
    269
    +    raised will be either an enhancement or a bug)
    
    270
    +    
    
    271
    +- If you plan to work on an issue, please:
    
    272
    +
    
    273
    +  - self-assign the ticket
    
    274
    +  - ensure it's captured in the current sprint (i.e., GitLab milestone)
    
    275
    +  - ensure the ticket is in the ``ToDo`` column of the board if you aim to 
    
    276
    +    complete it in the current sprint but aren't yet working on it, and
    
    277
    +    the ``Doing`` column if you are working on it currently.
    
    278
    +
    
    279
    +- Please note that GitLab issues are for either 'tasks' or 'bugs', i.e., not for
    
    280
    +  long discussions (where the mailing list is a better choice) or for ranting, 
    
    281
    +  for example.
    
    282
    +  
    
    283
    +The above may seem like a lot to take in, but please don't worry about getting 
    
    284
    +it right the first few times. The worst that can happen is that you'll get a 
    
    285
    +friendly message from a current contributor who explains the process. We welcome
    
    286
    +and value all contributions to the project!  
    
    257 287
     
    
    258 288
     .. _Milestones: https://docs.gitlab.com/ee/user/project/milestones
    
    259 289
     .. _Epics: https://docs.gitlab.com/ee/user/group/epics
    

  • Dockerfile
    1
    +FROM python:3.5-stretch
    
    2
    +
    
    3
    +# Point the path to where buildgrid gets installed
    
    4
    +ENV PATH=$PATH:/root/.local/bin/
    
    5
    +
    
    6
    +# Upgrade python modules
    
    7
    +RUN python3 -m pip install --upgrade setuptools pip
    
    8
    +
    
    9
    +# Use /app as the current working directory
    
    10
    +WORKDIR /app
    
    11
    +
    
    12
    +# Copy the repo contents (source, config files, etc) in the WORKDIR
    
    13
    +COPY . .
    
    14
    +
    
    15
    +# Install BuildGrid
    
    16
    +RUN pip install --user --editable .
    
    17
    +
    
    18
    +# Entry Point of the image (should get an additional argument from CMD, the path to the config file)
    
    19
    +ENTRYPOINT ["bgd", "-v", "server", "start"]
    
    20
    +
    
    21
    +# Default config file (used if no CMD specified when running)
    
    22
    +CMD ["buildgrid/_app/settings/default.yml"]
    
    23
    +

  • README.rst
    1
    -
    
    2 1
     .. _about:
    
    3 2
     
    
    4 3
     About
    
    ... ... @@ -14,13 +13,15 @@ BuildGrid is a Python remote execution service which implements Google's
    14 13
     `Remote Execution API`_ and the `Remote Workers API`_. The project's goal is to
    
    15 14
     be able to execute build jobs remotely on a grid of computers in order to
    
    16 15
     massively speed up build times. Workers on the grid should be able to run with
    
    17
    -different environments. It is designed to work with clients such as `Bazel`_ and 
    
    18
    -`BuildStream`_.
    
    16
    +different environments. It works with clients such as `Bazel`_, 
    
    17
    +`BuildStream`_ and `RECC`_, and is designed to be able to work with any client
    
    18
    +that conforms to the above API protocols.
    
    19 19
     
    
    20 20
     .. _Remote Execution API: https://github.com/bazelbuild/remote-apis
    
    21 21
     .. _Remote Workers API: https://docs.google.com/document/d/1s_AzRRD2mdyktKUj2HWBn99rMg_3tcPvdjx3MPbFidU/edit#heading=h.1u2taqr2h940
    
    22 22
     .. _BuildStream: https://wiki.gnome.org/Projects/BuildStream
    
    23 23
     .. _Bazel: https://bazel.build
    
    24
    +.. _RECC: https://gitlab.com/bloomberg/recc
    
    24 25
     
    
    25 26
     
    
    26 27
     .. _getting-started:
    
    ... ... @@ -49,7 +50,7 @@ Resources
    49 50
     
    
    50 51
     .. _Homepage: https://buildgrid.build
    
    51 52
     .. _GitLab repository: https://gitlab.com/BuildGrid/buildgrid
    
    52
    -.. _Bug tracking: https://gitlab.com/BuildGrid/buildgrid/issues
    
    53
    +.. _Bug tracking: https://gitlab.com/BuildGrid/buildgrid/boards
    
    53 54
     .. _Mailing list: https://lists.buildgrid.build/cgi-bin/mailman/listinfo/buildgrid
    
    54 55
     .. _Slack channel: https://buildteamworld.slack.com/messages/CC9MKC203
    
    55 56
     .. _invite link: https://join.slack.com/t/buildteamworld/shared_invite/enQtMzkxNzE0MDMyMDY1LTRmZmM1OWE0OTFkMGE1YjU5Njc4ODEzYjc0MGMyOTM5ZTQ5MmE2YTQ1MzQwZDc5MWNhODY1ZmRkZTE4YjFhNjU

  • buildgrid/_app/bots/buildbox.py
    ... ... @@ -17,8 +17,6 @@ import os
    17 17
     import subprocess
    
    18 18
     import tempfile
    
    19 19
     
    
    20
    -from google.protobuf import any_pb2
    
    21
    -
    
    22 20
     from buildgrid.client.cas import download, upload
    
    23 21
     from buildgrid._exceptions import BotError
    
    24 22
     from buildgrid._protos.build.bazel.remote.execution.v2 import remote_execution_pb2
    
    ... ... @@ -29,13 +27,14 @@ from buildgrid.utils import read_file, write_file
    29 27
     def work_buildbox(context, lease):
    
    30 28
         """Executes a lease for a build action, using buildbox.
    
    31 29
         """
    
    32
    -
    
    33 30
         local_cas_directory = context.local_cas
    
    34 31
         # instance_name = context.parent
    
    35 32
         logger = context.logger
    
    36 33
     
    
    37 34
         action_digest = remote_execution_pb2.Digest()
    
    35
    +
    
    38 36
         lease.payload.Unpack(action_digest)
    
    37
    +    lease.result.Clear()
    
    39 38
     
    
    40 39
         with download(context.cas_channel) as downloader:
    
    41 40
             action = downloader.get_message(action_digest,
    
    ... ... @@ -131,10 +130,7 @@ def work_buildbox(context, lease):
    131 130
     
    
    132 131
                 action_result.output_directories.extend([output_directory])
    
    133 132
     
    
    134
    -            action_result_any = any_pb2.Any()
    
    135
    -            action_result_any.Pack(action_result)
    
    136
    -
    
    137
    -            lease.result.CopyFrom(action_result_any)
    
    133
    +            lease.result.Pack(action_result)
    
    138 134
     
    
    139 135
         return lease
    
    140 136
     
    

  • buildgrid/_app/bots/dummy.py
    ... ... @@ -16,9 +16,18 @@
    16 16
     import random
    
    17 17
     import time
    
    18 18
     
    
    19
    +from buildgrid._protos.build.bazel.remote.execution.v2 import remote_execution_pb2
    
    20
    +
    
    19 21
     
    
    20 22
     def work_dummy(context, lease):
    
    21 23
         """ Just returns lease after some random time
    
    22 24
         """
    
    25
    +    lease.result.Clear()
    
    26
    +
    
    23 27
         time.sleep(random.randint(1, 5))
    
    28
    +
    
    29
    +    action_result = remote_execution_pb2.ActionResult()
    
    30
    +
    
    31
    +    lease.result.Pack(action_result)
    
    32
    +
    
    24 33
         return lease

  • buildgrid/_app/bots/host.py
    ... ... @@ -17,8 +17,6 @@ import os
    17 17
     import subprocess
    
    18 18
     import tempfile
    
    19 19
     
    
    20
    -from google.protobuf import any_pb2
    
    21
    -
    
    22 20
     from buildgrid.client.cas import download, upload
    
    23 21
     from buildgrid._protos.build.bazel.remote.execution.v2 import remote_execution_pb2
    
    24 22
     from buildgrid.utils import output_file_maker, output_directory_maker
    
    ... ... @@ -27,12 +25,13 @@ from buildgrid.utils import output_file_maker, output_directory_maker
    27 25
     def work_host_tools(context, lease):
    
    28 26
         """Executes a lease for a build action, using host tools.
    
    29 27
         """
    
    30
    -
    
    31 28
         instance_name = context.parent
    
    32 29
         logger = context.logger
    
    33 30
     
    
    34 31
         action_digest = remote_execution_pb2.Digest()
    
    32
    +
    
    35 33
         lease.payload.Unpack(action_digest)
    
    34
    +    lease.result.Clear()
    
    36 35
     
    
    37 36
         with tempfile.TemporaryDirectory() as temp_directory:
    
    38 37
             with download(context.cas_channel, instance=instance_name) as downloader:
    
    ... ... @@ -122,9 +121,6 @@ def work_host_tools(context, lease):
    122 121
     
    
    123 122
                 action_result.output_directories.extend(output_directories)
    
    124 123
     
    
    125
    -        action_result_any = any_pb2.Any()
    
    126
    -        action_result_any.Pack(action_result)
    
    127
    -
    
    128
    -        lease.result.CopyFrom(action_result_any)
    
    124
    +        lease.result.Pack(action_result)
    
    129 125
     
    
    130 126
         return lease

  • buildgrid/_app/commands/cmd_execute.py
    ... ... @@ -169,6 +169,7 @@ def run_command(context, input_root, commands, output_file, output_directory):
    169 169
     
    
    170 170
                 downloader.download_file(output_file_response.digest, path)
    
    171 171
     
    
    172
    -            if output_file_response.path in output_executeables:
    
    173
    -                st = os.stat(path)
    
    174
    -                os.chmod(path, st.st_mode | stat.S_IXUSR)
    
    172
    +    for output_file_response in execute_response.result.output_files:
    
    173
    +        if output_file_response.path in output_executeables:
    
    174
    +            st = os.stat(path)
    
    175
    +            os.chmod(path, st.st_mode | stat.S_IXUSR)
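
For reference, the executable-bit handling in the hunk above can be tried in isolation. This is a minimal sketch using only the standard library; `mark_executable` and the file name `output.sh` are hypothetical and are not part of BuildGrid.

```python
import os
import stat
import tempfile


def mark_executable(path):
    """Add the owner-execute bit to `path`, keeping all other mode bits."""
    st = os.stat(path)
    os.chmod(path, st.st_mode | stat.S_IXUSR)
    return os.stat(path).st_mode


with tempfile.TemporaryDirectory() as temp_directory:
    # Hypothetical downloaded output file.
    path = os.path.join(temp_directory, "output.sh")
    with open(path, "w") as output_file:
        output_file.write("#!/bin/sh\necho ok\n")

    mode_after = mark_executable(path)
    owner_can_execute = bool(mode_after & stat.S_IXUSR)
```

OR-ing `stat.S_IXUSR` into the existing mode leaves the read and write bits untouched, which is why the hunk reads the mode with `os.stat` first instead of setting a fixed value.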

  • buildgrid/server/bots/instance.py
    ... ... @@ -66,10 +66,10 @@ class BotsInterface:
    66 66
             self._bot_sessions[name] = bot_session
    
    67 67
             self.logger.info("Created bot session name=[{}] with bot_id=[{}]".format(name, bot_id))
    
    68 68
     
    
    69
    -        # For now, one lease at a time.
    
    70
    -        lease = self._scheduler.create_lease()
    
    71
    -        if lease:
    
    72
    -            bot_session.leases.extend([lease])
    
    69
    +        # TODO: Send worker capabilities to the scheduler!
    
    70
    +        leases = self._scheduler.request_job_leases({})
    
    71
    +        if leases:
    
    72
    +            bot_session.leases.extend(leases)
    
    73 73
     
    
    74 74
             return bot_session
    
    75 75
     
    
    ... ... @@ -85,11 +85,11 @@ class BotsInterface:
    85 85
             del bot_session.leases[:]
    
    86 86
             bot_session.leases.extend(leases)
    
    87 87
     
    
    88
    -        # For now, one lease at a time
    
    88
    +        # TODO: Send worker capabilities to the scheduler!
    
    89 89
             if not bot_session.leases:
    
    90
    -            lease = self._scheduler.create_lease()
    
    91
    -            if lease:
    
    92
    -                bot_session.leases.extend([lease])
    
    90
    +            leases = self._scheduler.request_job_leases({})
    
    91
    +            if leases:
    
    92
    +                bot_session.leases.extend(leases)
    
    93 93
     
    
    94 94
             self._bot_sessions[name] = bot_session
    
    95 95
             return bot_session
    
    ... ... @@ -109,7 +109,8 @@ class BotsInterface:
    109 109
             if server_state == LeaseState.PENDING:
    
    110 110
     
    
    111 111
                 if client_state == LeaseState.ACTIVE:
    
    112
    -                self._scheduler.update_job_lease_state(client_lease.id, client_lease.state)
    
    112
    +                self._scheduler.update_job_lease_state(client_lease.id,
    
    113
    +                                                       LeaseState.ACTIVE)
    
    113 114
                 elif client_state == LeaseState.COMPLETED:
    
    114 115
                     # TODO: Lease was rejected
    
    115 116
                     raise NotImplementedError("'Not Accepted' is unsupported")
    
    ... ... @@ -122,8 +123,10 @@ class BotsInterface:
    122 123
                     pass
    
    123 124
     
    
    124 125
                 elif client_state == LeaseState.COMPLETED:
    
    125
    -                self._scheduler.update_job_lease_state(client_lease.id, client_lease.state)
    
    126
    -                self._scheduler.job_complete(client_lease.id, client_lease.result, client_lease.status)
    
    126
    +                self._scheduler.update_job_lease_state(client_lease.id,
    
    127
    +                                                       LeaseState.COMPLETED,
    
    128
    +                                                       lease_status=client_lease.status,
    
    129
    +                                                       lease_result=client_lease.result)
    
    127 130
                     return None
    
    128 131
     
    
    129 132
                 else:
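
The lease-state handling changed in the hunks above can be sketched roughly as follows. This is an illustration only: `LeaseState` uses stand-in values rather than the ones from `bots_pb2`, `RecordingScheduler` and `check_lease_state` are hypothetical names, and the second branch is assumed to cover a server-side `ACTIVE` state, which the diff does not show.

```python
from enum import Enum


class LeaseState(Enum):
    # Stand-in values for this sketch; the real ones come from bots_pb2.
    PENDING = 1
    ACTIVE = 2
    COMPLETED = 3


class RecordingScheduler:
    """Hypothetical scheduler stub that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def update_job_lease_state(self, lease_id, lease_state,
                               lease_status=None, lease_result=None):
        self.calls.append((lease_id, lease_state, lease_status, lease_result))


def check_lease_state(scheduler, server_state, lease_id, client_state,
                      status=None, result=None):
    """Returns True if the lease should stay in the bot session."""
    if server_state == LeaseState.PENDING:
        if client_state == LeaseState.ACTIVE:
            scheduler.update_job_lease_state(lease_id, LeaseState.ACTIVE)
        elif client_state == LeaseState.COMPLETED:
            raise NotImplementedError("'Not Accepted' is unsupported")

    elif server_state == LeaseState.ACTIVE:  # Assumed branch, not in the diff.
        if client_state == LeaseState.COMPLETED:
            # Status and result now travel with the state update itself.
            scheduler.update_job_lease_state(lease_id, LeaseState.COMPLETED,
                                             lease_status=status,
                                             lease_result=result)
            return False

    return True


scheduler = RecordingScheduler()
kept = check_lease_state(scheduler, LeaseState.PENDING, "lease-1",
                         LeaseState.ACTIVE)
dropped = check_lease_state(scheduler, LeaseState.ACTIVE, "lease-1",
                            LeaseState.COMPLETED, status="OK", result="result")
```

Passing the status and result as keyword arguments on completion replaces the separate `job_complete()` call removed in the diff, so the scheduler sees a single update per transition.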
    

  • buildgrid/server/execution/instance.py
    ... ... @@ -48,12 +48,15 @@ class ExecutionInstance:
    48 48
             if not action:
    
    49 49
                 raise FailedPreconditionError("Could not get action from storage.")
    
    50 50
     
    
    51
    -        job = Job(action_digest, action.do_not_cache, message_queue)
    
    51
    +        job = Job(action, action_digest)
    
    52
    +        if message_queue is not None:
    
    53
    +            job.register_client(message_queue)
    
    54
    +
    
    52 55
             self.logger.info("Operation name: [{}]".format(job.name))
    
    53 56
     
    
    54
    -        self._scheduler.append_job(job, skip_cache_lookup)
    
    57
    +        self._scheduler.queue_job(job, skip_cache_lookup)
    
    55 58
     
    
    56
    -        return job.get_operation()
    
    59
    +        return job.operation
    
    57 60
     
    
    58 61
         def register_message_client(self, name, queue):
    
    59 62
             try:
    

  • buildgrid/server/job.py
    ... ... @@ -11,151 +11,230 @@
    11 11
     # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    
    12 12
     # See the License for the specific language governing permissions and
    
    13 13
     # limitations under the License.
    
    14
    -#
    
    15
    -# Authors:
    
    16
    -#        Finn Ball <finn ball codethink co uk>
    
    14
    +
    
    17 15
     
    
    18 16
     import logging
    
    19 17
     import uuid
    
    20 18
     from enum import Enum
    
    21 19
     
    
    22
    -from google.protobuf import any_pb2
    
    20
    +from google.protobuf import timestamp_pb2
    
    23 21
     
    
    24 22
     from buildgrid._protos.build.bazel.remote.execution.v2 import remote_execution_pb2
    
    25 23
     from buildgrid._protos.google.devtools.remoteworkers.v1test2 import bots_pb2
    
    26 24
     from buildgrid._protos.google.longrunning import operations_pb2
    
    27 25
     
    
    28 26
     
    
    29
    -class ExecuteStage(Enum):
    
    27
    +class OperationStage(Enum):
    
    28
    +    # Initially unknown stage.
    
    30 29
         UNKNOWN = remote_execution_pb2.ExecuteOperationMetadata.Stage.Value('UNKNOWN')
    
    31
    -
    
    32 30
         # Checking the result against the cache.
    
    33 31
         CACHE_CHECK = remote_execution_pb2.ExecuteOperationMetadata.Stage.Value('CACHE_CHECK')
    
    34
    -
    
    35 32
         # Currently idle, awaiting a free machine to execute.
    
    36 33
         QUEUED = remote_execution_pb2.ExecuteOperationMetadata.Stage.Value('QUEUED')
    
    37
    -
    
    38 34
         # Currently being executed by a worker.
    
    39 35
         EXECUTING = remote_execution_pb2.ExecuteOperationMetadata.Stage.Value('EXECUTING')
    
    40
    -
    
    41 36
         # Finished execution.
    
    42 37
         COMPLETED = remote_execution_pb2.ExecuteOperationMetadata.Stage.Value('COMPLETED')
    
    43 38
     
    
    44 39
     
    
    45
    -class BotStatus(Enum):
    
    46
    -    BOT_STATUS_UNSPECIFIED = bots_pb2.BotStatus.Value('BOT_STATUS_UNSPECIFIED')
    
    47
    -
    
    48
    -    # The bot is healthy, and will accept leases as normal.
    
    49
    -    OK = bots_pb2.BotStatus.Value('OK')
    
    50
    -
    
    51
    -    # The bot is unhealthy and will not accept new leases.
    
    52
    -    UNHEALTHY = bots_pb2.BotStatus.Value('UNHEALTHY')
    
    53
    -
    
    54
    -    # The bot has been asked to reboot the host.
    
    55
    -    HOST_REBOOTING = bots_pb2.BotStatus.Value('HOST_REBOOTING')
    
    56
    -
    
    57
    -    # The bot has been asked to shut down.
    
    58
    -    BOT_TERMINATING = bots_pb2.BotStatus.Value('BOT_TERMINATING')
    
    59
    -
    
    60
    -
    
    61 40
     class LeaseState(Enum):
    
    41
    +    # Initially unknown state.
    
    62 42
         LEASE_STATE_UNSPECIFIED = bots_pb2.LeaseState.Value('LEASE_STATE_UNSPECIFIED')
    
    63
    -
    
    64 43
         # The server expects the bot to accept this lease.
    
    65 44
         PENDING = bots_pb2.LeaseState.Value('PENDING')
    
    66
    -
    
    67 45
         # The bot has accepted this lease.
    
    68 46
         ACTIVE = bots_pb2.LeaseState.Value('ACTIVE')
    
    69
    -
    
    70 47
         # The bot is no longer leased.
    
    71 48
         COMPLETED = bots_pb2.LeaseState.Value('COMPLETED')
    
    72
    -
    
    73 49
         # The bot should immediately release all resources associated with the lease.
    
    74 50
         CANCELLED = bots_pb2.LeaseState.Value('CANCELLED')
    
    75 51
     
    
    76 52
     
    
    77 53
     class Job:
    
    78 54
     
    
    79
    -    def __init__(self, action_digest, do_not_cache=False, message_queue=None):
    
    80
    -        self.lease = None
    
    55
    +    def __init__(self, action, action_digest):
    
    81 56
             self.logger = logging.getLogger(__name__)
    
    82
    -        self.n_tries = 0
    
    83
    -        self.result = None
    
    84
    -        self.result_cached = False
    
    85 57
     
    
    86
    -        self._action_digest = action_digest
    
    87
    -        self._do_not_cache = do_not_cache
    
    88
    -        self._execute_stage = ExecuteStage.UNKNOWN
    
    89 58
             self._name = str(uuid.uuid4())
    
    90
    -        self._operation = operations_pb2.Operation(name=self._name)
    
    91
    -        self._operation_update_queues = []
    
    59
    +        self._action = remote_execution_pb2.Action()
    
    60
    +        self._operation = operations_pb2.Operation()
    
    61
    +        self._lease = None
    
    62
    +
    
    63
    +        self.__execute_response = None
    
    64
    +        self.__operation_metadata = remote_execution_pb2.ExecuteOperationMetadata()
    
    65
    +        self.__queued_timestamp = timestamp_pb2.Timestamp()
    
    66
    +        self.__worker_start_timestamp = timestamp_pb2.Timestamp()
    
    67
    +        self.__worker_completed_timestamp = timestamp_pb2.Timestamp()
    
    92 68
     
    
    93
    -        if message_queue is not None:
    
    94
    -            self.register_client(message_queue)
    
    69
    +        self.__operation_metadata.action_digest.CopyFrom(action_digest)
    
    70
    +        self.__operation_metadata.stage = OperationStage.UNKNOWN.value
    
    71
    +
    
    72
    +        self._action.CopyFrom(action)
    
    73
    +        self._do_not_cache = self._action.do_not_cache
    
    74
    +        self._operation_update_queues = []
    
    75
    +        self._operation.name = self._name
    
    76
    +        self._operation.done = False
    
    77
    +        self._n_tries = 0
    
    95 78
     
    
    96 79
         @property
    
    97 80
         def name(self):
    
    98 81
             return self._name
    
    99 82
     
    
    83
    +    @property
    
    84
    +    def do_not_cache(self):
    
    85
    +        return self._do_not_cache
    
    86
    +
    
    87
    +    @property
    
    88
    +    def action(self):
    
    89
    +        return self._action
    
    90
    +
    
    100 91
         @property
    
    101 92
         def action_digest(self):
    
    102
    -        return self._action_digest
    
    93
    +        return self.__operation_metadata.action_digest
    
    103 94
     
    
    104 95
         @property
    
    105
    -    def do_not_cache(self):
    
    106
    -        return self._do_not_cache
    
    96
    +    def action_result(self):
    
    97
    +        if self.__execute_response is not None:
    
    98
    +            return self.__execute_response.result
    
    99
    +        else:
    
    100
    +            return None
    
    101
    +
    
    102
    +    @property
    
    103
    +    def operation(self):
    
    104
    +        return self._operation
    
    107 105
     
    
    108
    -    def check_job_finished(self):
    
    109
    -        if not self._operation_update_queues:
    
    110
    -            return self._operation.done
    
    111
    -        return False
    
    106
    +    @property
    
    107
    +    def operation_stage(self):
    
    108
    +        return OperationStage(self.__operation_metadata.stage)
    
    109
    +
    
    110
    +    @property
    
    111
    +    def lease(self):
    
    112
    +        return self._lease
    
    113
    +
    
    114
    +    @property
    
    115
    +    def lease_state(self):
    
    116
    +        if self._lease is not None:
    
    117
    +            return LeaseState(self._lease.state)
    
    118
    +        else:
    
    119
    +            return None
    
    120
    +
    
    121
    +    @property
    
    122
    +    def n_tries(self):
    
    123
    +        return self._n_tries
    
    124
    +
    
    125
    +    @property
    
    126
    +    def n_clients(self):
    
    127
    +        return len(self._operation_update_queues)
    
    112 128
     
    
    113 129
         def register_client(self, queue):
    
    130
    +        """Subscribes to the job's :class:`Operation` stage change events.
    
    131
    +
    
    132
    +        Args:
    
    133
    +            queue (queue.Queue): the event queue to register.
    
    134
    +        """
    
    114 135
             self._operation_update_queues.append(queue)
    
    115
    -        queue.put(self.get_operation())
    
    136
    +        queue.put(self._operation)
    
    116 137
     
    
    117 138
         def unregister_client(self, queue):
    
    139
    +        """Unsubscribes from the job's :class:`Operation` stage change events.
    
    140
    +
    
    141
    +        Args:
    
    142
    +            queue (queue.Queue): the event queue to unregister.
    
    143
    +        """
    
    118 144
             self._operation_update_queues.remove(queue)
    
    119 145
     
    
    120
    -    def get_operation(self):
    
    121
    -        self._operation.metadata.CopyFrom(self._pack_any(self.get_operation_meta()))
    
    122
    -        if self.result is not None:
    
    123
    -            self._operation.done = True
    
    124
    -            response = remote_execution_pb2.ExecuteResponse(result=self.result,
    
    125
    -                                                            cached_result=self.result_cached)
    
    146
    +    def set_cached_result(self, action_result):
    
    147
    +        """Allows specifying an action result from the action cache for the job.
    
    148
    +        """
    
    149
    +        self.__execute_response = remote_execution_pb2.ExecuteResponse()
    
    150
    +        self.__execute_response.result.CopyFrom(action_result)
    
    151
    +        self.__execute_response.cached_result = True
    
    126 152
     
    
    127
    -            if not self.result_cached:
    
    128
    -                response.status.CopyFrom(self.lease.status)
    
    153
    +    def create_lease(self):
    
    154
    +        """Emits a new :class:`Lease` for the job.
    
    129 155
     
    
    130
    -            self._operation.response.CopyFrom(self._pack_any(response))
    
    156
    +        Only one :class:`Lease` can be emitted for a given job. This method
    
    157
    +        should only be used once; any further calls are ignored.
    
    158
    +        """
    
    159
    +        if self._lease is not None:
    
    160
    +            return None
    
    131 161
     
    
    132
    -        return self._operation
    
    162
    +        self._lease = bots_pb2.Lease()
    
    163
    +        self._lease.id = self._name
    
    164
    +        self._lease.payload.Pack(self.__operation_metadata.action_digest)
    
    165
    +        self._lease.state = LeaseState.PENDING.value
    
    133 166
     
    
    134
    -    def get_operation_meta(self):
    
    135
    -        meta = remote_execution_pb2.ExecuteOperationMetadata()
    
    136
    -        meta.stage = self._execute_stage.value
    
    137
    -        meta.action_digest.CopyFrom(self._action_digest)
    
    167
    +        return self._lease
    
    138 168
     
    
    139
    -        return meta
    
    169
    +    def update_lease_state(self, state, status=None, result=None):
    
    170
    +        """Operates a state transition for the job's current :class:`Lease`.
    
    140 171
     
    
    141
    -    def create_lease(self):
    
    142
    -        action_digest = self._pack_any(self._action_digest)
    
    172
    +        Args:
    
    173
    +            state (LeaseState): the lease state to transition to.
    
    174
    +            status (google.rpc.Status): the lease execution status, only
    
    175
    +                required if `state` is `COMPLETED`.
    
    176
    +            result (google.protobuf.Any): the lease execution result, only
    
    177
    +                required if `state` is `COMPLETED`.
    
    178
    +        """
    
    179
    +        if state.value == self._lease.state:
    
    180
    +            return
    
    143 181
     
    
    144
    -        lease = bots_pb2.Lease(id=self.name,
    
    145
    -                               payload=action_digest,
    
    146
    -                               state=LeaseState.PENDING.value)
    
    147
    -        self.lease = lease
    
    148
    -        return lease
    
    182
    +        self._lease.state = state.value
    
    149 183
     
    
    150
    -    def get_operations(self):
    
    151
    -        return operations_pb2.ListOperationsResponse(operations=[self.get_operation()])
    
    184
    +        if self._lease.state == LeaseState.PENDING.value:
    
    185
    +            self.__worker_start_timestamp.Clear()
    
    186
    +            self.__worker_completed_timestamp.Clear()
    
    152 187
     
    
    153
    -    def update_execute_stage(self, stage):
    
    154
    -        self._execute_stage = stage
    
    155
    -        for queue in self._operation_update_queues:
    
    156
    -            queue.put(self.get_operation())
    
    188
    +            self._lease.status.Clear()
    
    189
    +            self._lease.result.Clear()
    
    157 190
     
    
    158
    -    def _pack_any(self, pack):
    
    159
    -        some_any = any_pb2.Any()
    
    160
    -        some_any.Pack(pack)
    
    161
    -        return some_any
    191
    +        elif self._lease.state == LeaseState.ACTIVE.value:
    
    192
    +            self.__worker_start_timestamp.GetCurrentTime()
    
    193
    +
    
    194
    +        elif self._lease.state == LeaseState.COMPLETED.value:
    
    195
    +            self.__worker_completed_timestamp.GetCurrentTime()
    
    196
    +
    
    197
    +            action_result = remote_execution_pb2.ActionResult()
    
    198
    +
    
    199
    +            # TODO: Make a distinction between build and bot failures!
    
    200
    +            if status.code != 0:
    
    201
    +                self._do_not_cache = True
    
    202
    +
    
    203
    +            if result is not None:
    
    204
    +                assert result.Is(action_result.DESCRIPTOR)
    
    205
    +                result.Unpack(action_result)
    
    206
    +
    
    207
    +            action_result.execution_metadata.queued_timestamp.CopyFrom(self.__queued_timestamp)
    
    208
    +            action_result.execution_metadata.worker_start_timestamp.CopyFrom(self.__worker_start_timestamp)
    
    209
    +            action_result.execution_metadata.worker_completed_timestamp.CopyFrom(self.__worker_completed_timestamp)
    
    210
    +
    
    211
    +            self.__execute_response = remote_execution_pb2.ExecuteResponse()
    
    212
    +            self.__execute_response.result.CopyFrom(action_result)
    
    213
    +            self.__execute_response.cached_result = False
    
    214
    +            self.__execute_response.status.CopyFrom(status)
    
    215
    +
    
    216
    +    def update_operation_stage(self, stage):
    
    217
    +        """Operates a stage transition for the job's :class:`Operation`.
    
    218
    +
    
    219
    +        Args:
    
    220
    +            stage (OperationStage): the operation stage to transition to.
    
    221
    +        """
    
    222
    +        if stage.value == self.__operation_metadata.stage:
    
    223
    +            return
    
    224
    +
    
    225
    +        self.__operation_metadata.stage = stage.value
    
    226
    +
    
    227
    +        if self.__operation_metadata.stage == OperationStage.QUEUED.value:
    
    228
    +            if not self.__queued_timestamp:
    
    229
    +                self.__queued_timestamp.GetCurrentTime()
    
    230
    +            self._n_tries += 1
    
    231
    +
    
    232
    +        elif self.__operation_metadata.stage == OperationStage.COMPLETED.value:
    
    233
    +            if self.__execute_response is not None:
    
    234
    +                self._operation.response.Pack(self.__execute_response)
    
    235
    +            self._operation.done = True
    
    236
    +
    
    237
    +        self._operation.metadata.Pack(self.__operation_metadata)
    
    238
    +
    
    239
    +        for queue in self._operation_update_queues:
    
    240
    +            queue.put(self._operation)
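
Note: the subscription pattern introduced in job.py above (register_client, unregister_client, update_operation_stage) can be sketched without any protobuf dependency. This is a simplified illustration, not BuildGrid's actual Job class: `SketchJob` is a hypothetical name, the operation is reduced to a `(stage, done)` tuple, and the enum values are assumed to mirror the REAPI stage numbering.

```python
import enum
import queue


class OperationStage(enum.Enum):
    # Assumed to mirror the REAPI ExecuteOperationMetadata stage values.
    UNKNOWN = 0
    CACHE_CHECK = 1
    QUEUED = 2
    EXECUTING = 3
    COMPLETED = 4


class SketchJob:
    """Hypothetical stand-in for Job, with no protobuf messages involved."""

    def __init__(self, name):
        self.name = name
        self.stage = OperationStage.UNKNOWN
        self.done = False
        self.n_tries = 0
        self._update_queues = []

    @property
    def n_clients(self):
        return len(self._update_queues)

    def register_client(self, update_queue):
        # A new subscriber immediately receives the current state.
        self._update_queues.append(update_queue)
        update_queue.put((self.stage, self.done))

    def unregister_client(self, update_queue):
        self._update_queues.remove(update_queue)

    def update_operation_stage(self, stage):
        # Transitions to the current stage are ignored, as in the diff.
        if stage == self.stage:
            return
        self.stage = stage
        if stage == OperationStage.QUEUED:
            self.n_tries += 1
        elif stage == OperationStage.COMPLETED:
            self.done = True
        # Every subscriber is notified of the new state.
        for update_queue in self._update_queues:
            update_queue.put((self.stage, self.done))


job = SketchJob("job-1")
events = queue.Queue()
job.register_client(events)
job.update_operation_stage(OperationStage.QUEUED)
job.update_operation_stage(OperationStage.QUEUED)  # no-op: same stage
job.update_operation_stage(OperationStage.COMPLETED)

received = []
while not events.empty():
    received.append(events.get())
```

The repeated QUEUED transition produces no second event and does not bump `n_tries`, which is the behaviour the early return in update_operation_stage appears to be aiming for.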

  • buildgrid/server/operations/instance.py
    ... ... @@ -22,6 +22,7 @@ An instance of the LongRunningOperations Service.
    22 22
     import logging
    
    23 23
     
    
    24 24
     from buildgrid._exceptions import InvalidArgumentError
    
    25
    +from buildgrid._protos.google.longrunning import operations_pb2
    
    25 26
     
    
    26 27
     
    
    27 28
     class OperationsInstance:
    
    ... ... @@ -34,18 +35,21 @@ class OperationsInstance:
    34 35
             server.add_operations_instance(self, instance_name)
    
    35 36
     
    
    36 37
         def get_operation(self, name):
    
    37
    -        operation = self._scheduler.jobs.get(name)
    
    38
    +        job = self._scheduler.jobs.get(name)
    
    38 39
     
    
    39
    -        if operation is None:
    
    40
    +        if job is None:
    
    40 41
                 raise InvalidArgumentError("Operation name does not exist: [{}]".format(name))
    
    41 42
     
    
    42 43
             else:
    
    43
    -            return operation.get_operation()
    
    44
    +            return job.operation
    
    44 45
     
    
    45 46
         def list_operations(self, list_filter, page_size, page_token):
    
    46 47
             # TODO: Pages
    
    47 48
             # Spec says number of pages and length of a page are optional
    
    48
    -        return self._scheduler.get_operations()
    
    49
    +        response = operations_pb2.ListOperationsResponse()
    
    50
    +        response.operations.extend([job.operation for job in self._scheduler.list_jobs()])
    
    51
    +
    
    52
    +        return response
    
    49 53
     
    
    50 54
         def delete_operation(self, name):
    
    51 55
             try:
    

  • buildgrid/server/scheduler.py
    ... ... @@ -11,9 +11,7 @@
    11 11
     # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    
    12 12
     # See the License for the specific language governing permissions and
    
    13 13
     # limitations under the License.
    
    14
    -#
    
    15
    -# Authors:
    
    16
    -#        Finn Ball <finn ball codethink co uk>
    
    14
    +
    
    17 15
     
    
    18 16
     """
    
    19 17
     Scheduler
    
    ... ... @@ -24,10 +22,8 @@ Schedules jobs.
    24 22
     from collections import deque
    
    25 23
     
    
    26 24
     from buildgrid._exceptions import NotFoundError
    
    27
    -from buildgrid._protos.build.bazel.remote.execution.v2 import remote_execution_pb2
    
    28
    -from buildgrid._protos.google.longrunning import operations_pb2
    
    29 25
     
    
    30
    -from .job import ExecuteStage, LeaseState
    
    26
    +from .job import OperationStage, LeaseState
    
    31 27
     
    
    32 28
     
    
    33 29
     class Scheduler:
    
    ... ... @@ -39,80 +35,96 @@ class Scheduler:
    39 35
             self.jobs = {}
    
    40 36
             self.queue = deque()
    
    41 37
     
    
    42
    -    def register_client(self, name, queue):
    
    43
    -        self.jobs[name].register_client(queue)
    
    38
    +    def register_client(self, job_name, queue):
    
    39
    +        self.jobs[job_name].register_client(queue)
    
    40
    +
    
    41
    +    def unregister_client(self, job_name, queue):
    
    42
    +        self.jobs[job_name].unregister_client(queue)
    
    44 43
     
    
    45
    -    def unregister_client(self, name, queue):
    
    46
    -        job = self.jobs[name]
    
    47
    -        job.unregister_client(queue)
    
    48
    -        if job.check_job_finished():
    
    49
    -            del self.jobs[name]
    
    44
    +        if not self.jobs[job_name].n_clients and self.jobs[job_name].operation.done:
    
    45
    +            del self.jobs[job_name]
    
    50 46
     
    
    51
    -    def append_job(self, job, skip_cache_lookup=False):
    
    47
    +    def queue_job(self, job, skip_cache_lookup=False):
    
    52 48
             self.jobs[job.name] = job
    
    49
    +
    
    50
    +        operation_stage = None
    
    53 51
             if self._action_cache is not None and not skip_cache_lookup:
    
    54 52
                 try:
    
    55
    -                cached_result = self._action_cache.get_action_result(job.action_digest)
    
    53
    +                action_result = self._action_cache.get_action_result(job.action_digest)
    
    56 54
                 except NotFoundError:
    
    55
    +                operation_stage = OperationStage.QUEUED
    
    57 56
                     self.queue.append(job)
    
    58
    -                job.update_execute_stage(ExecuteStage.QUEUED)
    
    59 57
     
    
    60 58
                 else:
    
    61
    -                job.result = cached_result
    
    62
    -                job.result_cached = True
    
    63
    -                job.update_execute_stage(ExecuteStage.COMPLETED)
    
    59
    +                job.set_cached_result(action_result)
    
    60
    +                operation_stage = OperationStage.COMPLETED
    
    64 61
     
    
    65 62
             else:
    
    63
    +            operation_stage = OperationStage.QUEUED
    
    66 64
                 self.queue.append(job)
    
    67
    -            job.update_execute_stage(ExecuteStage.QUEUED)
    
    68 65
     
    
    69
    -    def retry_job(self, name):
    
    70
    -        if name in self.jobs:
    
    71
    -            job = self.jobs[name]
    
    66
    +        job.update_operation_stage(operation_stage)
    
    67
    +
    
    68
    +    def retry_job(self, job_name):
    
    69
    +        if job_name in self.jobs:
    
    70
    +            job = self.jobs[job_name]
    
    72 71
                 if job.n_tries >= self.MAX_N_TRIES:
    
    73 72
                     # TODO: Decide what to do with these jobs
    
    74
    -                job.update_execute_stage(ExecuteStage.COMPLETED)
    
    73
    +                job.update_operation_stage(OperationStage.COMPLETED)
    
    75 74
                     # TODO: Mark these jobs as done
    
    76 75
                 else:
    
    77
    -                job.update_execute_stage(ExecuteStage.QUEUED)
    
    78
    -                job.n_tries += 1
    
    76
    +                job.update_operation_stage(OperationStage.QUEUED)
    
    79 77
                     self.queue.appendleft(job)
    
    80 78
     
    
    81
    -    def job_complete(self, name, result, status):
    
    82
    -        job = self.jobs[name]
    
    83
    -        job.lease.status.CopyFrom(status)
    
    84
    -        action_result = remote_execution_pb2.ActionResult()
    
    85
    -        result.Unpack(action_result)
    
    86
    -        job.result = action_result
    
    87
    -        if not job.do_not_cache and self._action_cache is not None:
    
    88
    -            if not job.lease.status.code:
    
    89
    -                self._action_cache.update_action_result(job.action_digest, action_result)
    
    90
    -        job.update_execute_stage(ExecuteStage.COMPLETED)
    
    91
    -
    
    92
    -    def get_operations(self):
    
    93
    -        response = operations_pb2.ListOperationsResponse()
    
    94
    -        for v in self.jobs.values():
    
    95
    -            response.operations.extend([v.get_operation()])
    
    96
    -        return response
    
    97
    -
    
    98
    -    def update_job_lease_state(self, name, state):
    
    99
    -        job = self.jobs[name]
    
    100
    -        job.lease.state = state
    
    101
    -
    
    102
    -    def get_job_lease(self, name):
    
    103
    -        return self.jobs[name].lease
    
    104
    -
    
    105
    -    def cancel_session(self, name):
    
    106
    -        job = self.jobs[name]
    
    107
    -        state = job.lease.state
    
    108
    -        if state in (LeaseState.PENDING.value, LeaseState.ACTIVE.value):
    
    109
    -            self.retry_job(name)
    
    110
    -
    
    111
    -    def create_lease(self):
    
    112
    -        if self.queue:
    
    113
    -            job = self.queue.popleft()
    
    114
    -            job.update_execute_stage(ExecuteStage.EXECUTING)
    
    115
    -            job.create_lease()
    
    116
    -            job.lease.state = LeaseState.PENDING.value
    
    117
    -            return job.lease
    
    118
    -        return None
    79
    +    def list_jobs(self):
    
    80
    +        return self.jobs.values()
    
    81
    +
    
    82
    +    def request_job_leases(self, worker_capabilities):
    
    83
    +        """Generates a list of the highest priority leases to be run.
    
    84
    +
    
    85
    +        Args:
    
    86
    +            worker_capabilities (dict): a set of key-value pairs describing the
    
    87
    +                worker properties, configuration and state at the time of the
    
    88
    +                request.
    
    89
    +        """
    
    90
    +        if not self.queue:
    
    91
    +            return []
    
    92
    +
    
    93
    +        job = self.queue.popleft()
    
    94
    +        # For now, one lease at a time:
    
    95
    +        lease = job.create_lease()
    
    96
    +
    
    97
    +        return [lease]
    
    98
    +
    
    99
    +    def update_job_lease_state(self, job_name, lease_state, lease_status=None, lease_result=None):
    
    100
    +        """Requests a state transition for a job's current :class:`Lease`.
    
    101
    +
    
    102
    +        Args:
    
    103
    +            job_name (str): name of the job to query.
    
    104
    +            lease_state (LeaseState): the lease state to transition to.
    
    105
    +            lease_status (google.rpc.Status): the lease execution status, only
    
    106
    +                required if `lease_state` is `COMPLETED`.
    
    107
    +            lease_result (google.protobuf.Any): the lease execution result, only
    
    108
    +                required if `lease_state` is `COMPLETED`.
    
    109
    +        """
    
    110
    +        job = self.jobs[job_name]
    
    111
    +
    
    112
    +        if lease_state != LeaseState.COMPLETED:
    
    113
    +            job.update_lease_state(lease_state)
    
    114
    +
    
    115
    +        else:
    
    116
    +            job.update_lease_state(lease_state,
    
    117
    +                                   status=lease_status, result=lease_result)
    
    118
    +
    
    119
    +            if self._action_cache is not None and not job.do_not_cache:
    
    120
    +                self._action_cache.update_action_result(job.action_digest, job.action_result)
    
    121
    +
    
    122
    +            job.update_operation_stage(OperationStage.COMPLETED)
    
    123
    +
    
    124
    +    def get_job_lease(self, job_name):
    
    125
    +        """Returns the lease associated with the job, if one has been emitted yet."""
    
    126
    +        return self.jobs[job_name].lease
    
    127
    +
    
    128
    +    def get_job_operation(self, job_name):
    
    129
    +        """Returns the operation associated with the job."""
    
    130
    +        return self.jobs[job_name].operation
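
Note: the queue_job / request_job_leases flow in scheduler.py above can be sketched as follows. This is a minimal illustration under stated assumptions, not the real Scheduler: `SketchJob` and `SketchScheduler` are hypothetical names, the action cache is a plain dict keyed by digest string, stages are plain strings, and a lease is a dict rather than a `bots_pb2.Lease`.

```python
from collections import deque


class SketchJob:
    """Hypothetical job holding only what the scheduling flow needs."""

    def __init__(self, name, action_digest):
        self.name = name
        self.action_digest = action_digest
        self.stage = "UNKNOWN"
        self.cached_result = None
        self.lease = None

    def create_lease(self):
        # Only one lease per job: further calls return None.
        if self.lease is not None:
            return None
        self.lease = {"id": self.name, "state": "PENDING"}
        return self.lease


class SketchScheduler:
    def __init__(self, action_cache=None):
        self._action_cache = action_cache  # assumption: dict of digest -> result
        self.jobs = {}
        self.queue = deque()

    def queue_job(self, job, skip_cache_lookup=False):
        self.jobs[job.name] = job
        cached = None
        if self._action_cache is not None and not skip_cache_lookup:
            cached = self._action_cache.get(job.action_digest)
        if cached is not None:
            # Cache hit: the job completes without ever being queued.
            job.cached_result = cached
            job.stage = "COMPLETED"
        else:
            self.queue.append(job)
            job.stage = "QUEUED"

    def request_job_leases(self):
        # One lease at a time, mirroring the comment in the diff.
        if not self.queue:
            return []
        lease = self.queue.popleft().create_lease()
        # Sketch-only choice: drop a None lease instead of returning [None].
        return [lease] if lease is not None else []


scheduler = SketchScheduler(action_cache={"digest-a": "cached-result"})
hit = SketchJob("job-hit", "digest-a")
miss = SketchJob("job-miss", "digest-b")
scheduler.queue_job(hit)
scheduler.queue_job(miss)

leases = scheduler.request_job_leases()
no_more = scheduler.request_job_leases()
```

Only the cache miss reaches the queue, so a single lease is handed out and the second request returns an empty list.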

  • docs/source/index.rst
    ... ... @@ -19,6 +19,7 @@ Remote execution service implementing Google's REAPI and RWAPI.
    19 19
        using.rst
    
    20 20
        reference.rst
    
    21 21
        contributing.rst
    
    22
    +   resources.rst
    
    22 23
     
    
    23 24
     
    
    24 25
     Resources
    
    ... ... @@ -28,11 +29,11 @@ Resources
    28 29
     - `GitLab repository`_
    
    29 30
     - `Bug tracking`_
    
    30 31
     - `Mailing list`_
    
    31
    -- `Slack channel`_  [`invite link`_]
    
    32
    +- `Slack channel`_ [`invite link`_]
    
    32 33
     
    
    33 34
     .. _Homepage: https://buildgrid.build
    
    34 35
     .. _GitLab repository: https://gitlab.com/BuildGrid/buildgrid
    
    35
    -.. _Bug tracking: https://gitlab.com/BuildGrid/buildgrid/issues
    
    36
    +.. _Bug tracking: https://gitlab.com/BuildGrid/buildgrid/boards
    
    36 37
     .. _Mailing list: https://lists.buildgrid.build/cgi-bin/mailman/listinfo/buildgrid
    
    37 38
     .. _Slack channel: https://buildteamworld.slack.com/messages/CC9MKC203
    
    38
    -.. _invite link: https://join.slack.com/t/buildteamworld/shared_invite/enQtMzkxNzE0MDMyMDY1LTRmZmM1OWE0OTFkMGE1YjU5Njc4ODEzYjc0MGMyOTM5ZTQ5MmE2YTQ1MzQwZDc5MWNhODY1ZmRkZTE4YjFhNjU
    39
    +.. _invite link: https://join.slack.com/t/buildteamworld/shared_invite/enQtMzkxNzE0MDMyMDY1LTRmZmM1OWE0OTFkMGE1YjU5Njc4ODEzYjc0MGMyOTM5ZTQ5MmE2YTQ1MzQwZDc5MWNhODY1ZmRkZTE4YjFhNjU
    \ No newline at end of file

  • docs/source/installation.rst
    1
    -
    
    2 1
     .. _installation:
    
    3 2
     
    
    4 3
     Installation
    
    5 4
     ============
    
    6 5
     
    
    7
    -How to install BuildGrid onto your machine.
    
    6
    +.. _install-on-host:
    
    7
    +
    
    8
    +Installation onto host machine
    
    9
    +------------------------------
    
    10
    +
    
    11
    +How to install BuildGrid directly onto your machine.
    
    8 12
     
    
    9 13
     .. note::
    
    10 14
     
    
    11
    -   BuildGrid server currently only support *Linux*, *macOS* and *Windows*
    
    15
    +   BuildGrid server currently only supports *Linux*. *macOS* and *Windows*
    
    12 16
        platforms are **not** supported.
    
    13 17
     
    
    14 18
     
    
    15
    -.. _install-prerequisites:
    
    19
    +.. _install-host-prerequisites:
    
    16 20
     
    
    17 21
     Prerequisites
    
    18
    --------------
    
    22
    +~~~~~~~~~~~~~
    
    19 23
     
    
    20 24
     BuildGrid only supports ``python3 >= 3.5`` but has no system requirements. Main
    
    21
    -Python dependencies, automatically handle during installation, includes:
    
    25
    +Python dependencies, automatically handled during installation, include:
    
    22 26
     
    
    23 27
     - `boto3`_: the Amazon Web Services (AWS) SDK for Python.
    
    24 28
     - `click`_: a Python composable command line library.
    
    ... ... @@ -33,10 +37,10 @@ Python dependencies, automatically handle during installation, includes:
    33 37
     .. _protocol-buffers: https://developers.google.com/protocol-buffers
    
    34 38
     
    
    35 39
     
    
    36
    -.. _source-install:
    
    40
    +.. _install-host-source-install:
    
    37 41
     
    
    38 42
     Install from sources
    
    39
    ---------------------
    
    43
    +~~~~~~~~~~~~~~~~~~~~
    
    40 44
     
    
    41 45
     BuildGrid has ``setuptools`` support. In order to install it to your home
    
    42 46
     directory, typically under ``~/.local``, simply run:
    
    ... ... @@ -46,7 +50,7 @@ directory, typically under ``~/.local``, simply run:
    46 50
        git clone https://gitlab.com/BuildGrid/buildgrid.git && cd buildgrid
    
    47 51
        pip3 install --user --editable .
    
    48 52
     
    
    49
    -Additionally, and if your distribution does not already includes it, you may
    
    53
    +Additionally, and if your distribution does not already include it, you may
    
    50 54
     have to adjust your ``PATH``, in ``~/.bashrc``, with:
    
    51 55
     
    
    52 56
     .. code-block:: sh
    
    ... ... @@ -63,3 +67,62 @@ have to adjust your ``PATH``, in ``~/.bashrc``, with:
    63 67
        .. code-block:: sh
    
    64 68
     
    
    65 69
           pip3 install --user --editable ".[docs,tests]"
    
    70
    +
    
    71
    +
    
    72
    +
    
    73
    +.. _install-docker:
    
    74
    +
    
    75
    +Installation through Docker
    
    76
    +---------------------------
    
    77
    +
    
    78
    +How to build a Docker image that runs BuildGrid.
    
    79
    +
    
    80
    +.. _install-docker-prerequisites:
    
    81
    +
    
    82
    +Prerequisites
    
    83
    +~~~~~~~~~~~~~
    
    84
    +
    
    85
    +A working Docker installation. Please consult `Docker's Getting Started Guide`_ if you don't already have it installed.
    
    86
    +
    
    87
    +.. _`Docker's Getting Started Guide`: https://www.docker.com/get-started
    
    88
    +
    
    89
    +
    
    90
    +.. _install-docker-build:
    
    91
    +
    
    92
    +Docker Container from Sources
    
    93
    +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    94
    +
    
    95
    +To clone the source code and build a Docker image, simply run:
    
    96
    +
    
    97
    +.. code-block:: sh
    
    98
    +
    
    99
    +   git clone https://gitlab.com/BuildGrid/buildgrid.git && cd buildgrid
    
    100
    +   docker build -t buildgrid_server .
    
    101
    +
    
    102
    +.. note::
    
    103
    +
    
    104
    +   The image built will contain the contents of the source code directory, including
    
    105
    +   configuration files.
    
    106
    +   
    
    107
    +.. hint::
    
    108
    +
    
    109
    +    Whenever the source code is updated or new configuration files are made, you need to re-build 
    
    110
    +    the image.
    
    111
    +
    
    112
    +After building the Docker image, to run BuildGrid using the default configuration file 
    
    113
    +(found in `buildgrid/_app/settings/default.yml`), simply run:
    
    114
    +
    
    115
    +.. code-block:: sh
    
    116
    +
    
    117
    +   docker run -i -p 50051:50051 buildgrid_server
    
    118
    +
    
    119
    +.. note::
    
    120
    +
    
    121
    +    To run BuildGrid using a different configuration file, include the relative path to the
    
    122
    +    configuration file at the end of the command above. For example, to run the default 
    
    123
    +    standalone CAS server (without an execution service), simply run:
    
    124
    +
    
    125
    +       .. code-block:: sh
    
    126
    +
    
    127
    +            docker run -i -p 50052:50052 buildgrid_server buildgrid/_app/settings/cas.yml
    
    128
    +

  • docs/source/resources.rst
    1
    +.. _external-resources:
    
    2
    +
    
    3
    +Resources
    
    4
    +=========
    
    5
    +
    
    6
    +Useful links on the remote execution and remote worker APIs:
    
    7
    +
    
    8
    +- `REAPI design document`_
    
    9
    +- `REAPI protobuf specification`_
    
    10
    +- `RWAPI design document`_
    
    11
    +- `RWAPI protobuf specification`_
    
    12
    +- `Bazel`_ `remote caching and execution documentation`_
    
    13
    +- `RECC usage instructions`_
    
    14
    +- `BuildStream webside`_ and `documentation`_
    
    15
    +- `BuildStream-externals repository`_
    
    16
    +- `BuildBox repository`_
    
    17
    +- `FUSE on wikipedia`_ and `kernel documentation`_
    
    18
    +- `bubblewrap repository`_
    
    19
    +- `Buildfarm reference REAPI implementation`_
    
    20
    +- `Buildbarn Golang REAPI implementation`_
    
    21
    +- `Demonstration of RECC with BuildGrid`_
    
    22
    +- `Demonstration of Bazel with BuildGrid`_
    
    23
    +- `Demonstration of BuildStream with BuildGrid`_
    
    24
    +
    
    25
    +.. _REAPI design document: https://docs.google.com/document/d/1AaGk7fOPByEvpAbqeXIyE8HX_A3_axxNnvroblTZ_6s
    
    26
    +.. _REAPI protobuf specification: https://github.com/bazelbuild/remote-apis/blob/master/build/bazel/remote/execution/v2/remote_execution.proto
    
    27
    +.. _RWAPI design document: https://docs.google.com/document/d/1s_AzRRD2mdyktKUj2HWBn99rMg_3tcPvdjx3MPbFidU
    
    28
    +.. _RWAPI protobuf specification: https://github.com/googleapis/googleapis/blob/master/google/devtools/remoteworkers/v1test2/bots.proto
    
    29
    +.. _Bazel: https://www.bazel.build
    
    30
    +.. _remote caching and execution documentation: https://docs.bazel.build/versions/master/remote-caching.html
    
    31
    +.. _RECC usage instructions: https://gitlab.com/bloomberg/recc#running-recc
    
    32
    +.. _BuildStream webside: https://buildstream.build
    
    33
    +.. _documentation: https://docs.buildstream.build
    
    34
    +.. _BuildStream-externals repository: https://gitlab.com/BuildStream/bst-external
    
    35
    +.. _FUSE on wikipedia: https://en.wikipedia.org/wiki/Filesystem_in_Userspace
    
    36
    +.. _kernel documentation: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/fuse.txt
    
    37
    +.. _BuildBox repository: https://gitlab.com/BuildStream/buildbox
    
    38
    +.. _bubblewrap repository: https://github.com/projectatomic/bubblewrap
    
    39
    +.. _Buildfarm reference REAPI implementation: https://github.com/bazelbuild/bazel-buildfarm
    
    40
    +.. _Buildbarn Golang REAPI implementation: https://github.com/EdSchouten/bazel-buildbarn
    
    41
    +.. _Demonstration of RECC with BuildGrid: https://asciinema.org/a/0FjExIqrTGSlpSUIS8Ehf5gUg
    
    42
    +.. _Demonstration of Bazel with BuildGrid: https://asciinema.org/a/uVHFWOxpivwJ4ari23CEerR8N
    
    43
    +.. _Demonstration of BuildStream with BuildGrid: https://asciinema.org/a/QfkYGqhfhEQz4o8prlBdEBFP7

  • setup.cfg
    ... ... @@ -14,3 +14,6 @@ pep8ignore =
    14 14
         docs/source/conf.py ALL
    
    15 15
         *_pb2.py ALL
    
    16 16
         *_pb2_grpc.py ALL
    
    17
    +filterwarnings =
    
    18
    +    ignore::DeprecationWarning
    
    19
    +    ignore::PendingDeprecationWarning
    \ No newline at end of file
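
Note: as far as I understand, the `filterwarnings` entries added to setup.cfg above follow the same filter syntax as Python's own warning filters, so their effect can be approximated with the standard `warnings` module alone. The sketch below is an illustration of that behaviour, not of how pytest applies the setting internally; the warning messages are made up.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    # Start from "report everything", then add the two ignore rules.
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    warnings.filterwarnings("ignore", category=PendingDeprecationWarning)

    warnings.warn("old API", DeprecationWarning)
    warnings.warn("soon-to-be old API", PendingDeprecationWarning)
    warnings.warn("still reported", UserWarning)

# Only warnings outside the ignored categories are recorded.
reported = [str(w.message) for w in caught]
```

Later filters take precedence over earlier ones here because `filterwarnings` inserts at the front of the filter list by default, so the two ignore rules win over the initial "always" rule.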

  • setup.py
    ... ... @@ -90,7 +90,7 @@ tests_require = [
    90 90
         'moto',
    
    91 91
         'pep8',
    
    92 92
         'psutil',
    
    93
    -    'pytest == 3.6.4',
    
    93
    +    'pytest >= 3.8.0',
    
    94 94
         'pytest-cov >= 2.6.0',
    
    95 95
         'pytest-pep8',
    
    96 96
         'pytest-pylint',
    

  • tests/integration/bots_service.py
    ... ... @@ -137,7 +137,7 @@ def test_update_leases_with_work(bot_session, context, instance):
    137 137
                                                    bot_session=bot_session)
    
    138 138
     
    
    139 139
         action_digest = remote_execution_pb2.Digest(hash='gaff')
    
    140
    -    _inject_work(instance._instances[""]._scheduler, action_digest)
    
    140
    +    _inject_work(instance._instances[""]._scheduler, action_digest=action_digest)
    
    141 141
     
    
    142 142
         response = instance.CreateBotSession(request, context)
    
    143 143
     
    
    ... ... @@ -159,7 +159,7 @@ def test_update_leases_work_complete(bot_session, context, instance):
    159 159
     
    
    160 160
         # Inject work
    
    161 161
         action_digest = remote_execution_pb2.Digest(hash='gaff')
    
    162
    -    _inject_work(instance._instances[""]._scheduler, action_digest)
    
    162
    +    _inject_work(instance._instances[""]._scheduler, action_digest=action_digest)
    
    163 163
     
    
    164 164
         request = bots_pb2.UpdateBotSessionRequest(name=response.name,
    
    165 165
                                                    bot_session=response)
    
    ... ... @@ -174,6 +174,7 @@ def test_update_leases_work_complete(bot_session, context, instance):
    174 174
         response = copy.deepcopy(instance.UpdateBotSession(request, context))
    
    175 175
     
    
    176 176
         response.leases[0].state = LeaseState.COMPLETED.value
    
    177
    +    response.leases[0].result.Pack(remote_execution_pb2.ActionResult())
    
    177 178
     
    
    178 179
         request = bots_pb2.UpdateBotSessionRequest(name=response.name,
    
    179 180
                                                    bot_session=response)
    
    ... ... @@ -187,7 +188,7 @@ def test_work_rejected_by_bot(bot_session, context, instance):
    187 188
                                                    bot_session=bot_session)
    
    188 189
         # Inject work
    
    189 190
         action_digest = remote_execution_pb2.Digest(hash='gaff')
    
    190
    -    _inject_work(instance._instances[""]._scheduler, action_digest)
    
    191
    +    _inject_work(instance._instances[""]._scheduler, action_digest=action_digest)
    
    191 192
     
    
    192 193
         # Simulated the severed binding between client and server
    
    193 194
         response = copy.deepcopy(instance.CreateBotSession(request, context))
    
    ... ... @@ -209,7 +210,7 @@ def test_work_out_of_sync_from_pending(state, bot_session, context, instance):
    209 210
                                                    bot_session=bot_session)
    
    210 211
         # Inject work
    
    211 212
         action_digest = remote_execution_pb2.Digest(hash='gaff')
    
    212
    -    _inject_work(instance._instances[""]._scheduler, action_digest)
    
    213
    +    _inject_work(instance._instances[""]._scheduler, action_digest=action_digest)
    
    213 214
     
    
    214 215
         # Simulated the severed binding between client and server
    
    215 216
         response = copy.deepcopy(instance.CreateBotSession(request, context))
    
    ... ... @@ -230,7 +231,7 @@ def test_work_out_of_sync_from_active(state, bot_session, context, instance):
    230 231
                                                    bot_session=bot_session)
    
    231 232
         # Inject work
    
    232 233
         action_digest = remote_execution_pb2.Digest(hash='gaff')
    
    233
    -    _inject_work(instance._instances[""]._scheduler, action_digest)
    
    234
    +    _inject_work(instance._instances[""]._scheduler, action_digest=action_digest)
    
    234 235
     
    
    235 236
         # Simulated the severed binding between client and server
    
    236 237
         response = copy.deepcopy(instance.CreateBotSession(request, context))
    
    ... ... @@ -257,7 +258,7 @@ def test_work_active_to_active(bot_session, context, instance):
    257 258
                                                    bot_session=bot_session)
    
    258 259
         # Inject work
    
    259 260
         action_digest = remote_execution_pb2.Digest(hash='gaff')
    
    260
    -    _inject_work(instance._instances[""]._scheduler, action_digest)
    
    261
    +    _inject_work(instance._instances[""]._scheduler, action_digest=action_digest)
    
    261 262
     
    
    262 263
         # Simulated the severed binding between client and server
    
    263 264
         response = copy.deepcopy(instance.CreateBotSession(request, context))
    
    ... ... @@ -279,8 +280,10 @@ def test_post_bot_event_temp(context, instance):
    279 280
         context.set_code.assert_called_once_with(grpc.StatusCode.UNIMPLEMENTED)
    
    280 281
     
    
    281 282
     
    
    282
    -def _inject_work(scheduler, action_digest=None):
    
    283
    +def _inject_work(scheduler, action=None, action_digest=None):
    
    284
    +    if not action:
    
    285
    +        action = remote_execution_pb2.Action()
    
    283 286
         if not action_digest:
    
    284 287
             action_digest = remote_execution_pb2.Digest()
    
    285
    -    j = job.Job(action_digest, False)
    
    286
    -    scheduler.append_job(j, True)
    288
    +    j = job.Job(action, action_digest)
    
    289
    +    scheduler.queue_job(j, True)

  • tests/integration/execution_service.py
    ... ... @@ -82,7 +82,7 @@ def test_execute(skip_cache_lookup, instance, context):
    82 82
         assert isinstance(result, operations_pb2.Operation)
    
    83 83
         metadata = remote_execution_pb2.ExecuteOperationMetadata()
    
    84 84
         result.metadata.Unpack(metadata)
    
    85
    -    assert metadata.stage == job.ExecuteStage.QUEUED.value
    
    85
    +    assert metadata.stage == job.OperationStage.QUEUED.value
    
    86 86
         assert uuid.UUID(result.name, version=4)
    
    87 87
         assert result.done is False
    
    88 88
     
    
    ... ... @@ -105,7 +105,7 @@ def test_no_action_digest_in_storage(instance, context):
    105 105
     
    
    106 106
     
    
    107 107
     def test_wait_execution(instance, controller, context):
    
    108
    -    j = job.Job(action_digest, None)
    
    108
    +    j = job.Job(action, action_digest)
    
    109 109
         j._operation.done = True
    
    110 110
     
    
    111 111
         request = remote_execution_pb2.WaitExecutionRequest(name="{}/{}".format('', j.name))
    
    ... ... @@ -116,7 +116,7 @@ def test_wait_execution(instance, controller, context):
    116 116
         action_result = remote_execution_pb2.ActionResult()
    
    117 117
         action_result_any.Pack(action_result)
    
    118 118
     
    
    119
    -    j.update_execute_stage(job.ExecuteStage.COMPLETED)
    
    119
    +    j.update_operation_stage(job.OperationStage.COMPLETED)
    
    120 120
     
    
    121 121
         response = instance.WaitExecution(request, context)
    
    122 122
     
    
    ... ... @@ -125,7 +125,7 @@ def test_wait_execution(instance, controller, context):
    125 125
         assert isinstance(result, operations_pb2.Operation)
    
    126 126
         metadata = remote_execution_pb2.ExecuteOperationMetadata()
    
    127 127
         result.metadata.Unpack(metadata)
    
    128
    -    assert metadata.stage == job.ExecuteStage.COMPLETED.value
    
    128
    +    assert metadata.stage == job.OperationStage.COMPLETED.value
    
    129 129
         assert uuid.UUID(result.name, version=4)
    
    130 130
         assert result.done is True
    
    131 131
     
    

  • tests/integration/operations_service.py
    ... ... @@ -30,6 +30,7 @@ from buildgrid._protos.google.longrunning import operations_pb2
    30 30
     from buildgrid._protos.google.rpc import status_pb2
    
    31 31
     from buildgrid.server.cas.storage import lru_memory_cache
    
    32 32
     from buildgrid.server.controller import ExecutionController
    
    33
    +from buildgrid.server.job import LeaseState
    
    33 34
     from buildgrid.server.operations import service
    
    34 35
     from buildgrid.server.operations.service import OperationsService
    
    35 36
     from buildgrid.utils import create_digest
    
    ... ... @@ -131,9 +132,10 @@ def test_list_operations_with_result(instance, controller, execute_request, cont
    131 132
         action_result.output_files.extend([output_file])
    
    132 133
     
    
    133 134
         controller.operations_instance._scheduler.jobs[response_execute.name].create_lease()
    
    134
    -    controller.operations_instance._scheduler.job_complete(response_execute.name,
    
    135
    -                                                           _pack_any(action_result),
    
    136
    -                                                           status_pb2.Status())
    
    135
    +    controller.operations_instance._scheduler.update_job_lease_state(response_execute.name,
    
    136
    +                                                                     LeaseState.COMPLETED,
    
    137
    +                                                                     lease_status=status_pb2.Status(),
    
    138
    +                                                                     lease_result=_pack_any(action_result))
    
    137 139
     
    
    138 140
         request = operations_pb2.ListOperationsRequest(name=instance_name)
    
    139 141
         response = instance.ListOperations(request, context)
    


