Improving staging performance
- From: Angelos Evripiotis <angelos evripiotis gmail com>
- To: buildstream-list gnome org
- Subject: Improving staging performance
- Date: Wed, 15 Nov 2017 13:20:25 +0000
Hello!
In this message I hope to present a convincing case that staging delay is
significant for a reasonable amount of files, and that for rebuilds it can be
drastically reduced.
I ran an experiment locally with an alpine sysroot and 100k files. I think this
is a reasonable amount of files because it's roughly the same number I work
with on another project, which is DPKG-based.
Similar to the results in previous posts on performance, I see that staging
takes around 20 seconds. This is a significant speed bump in the workflow.
I found that with a trivial hack, it's possible to cache the results of
staging. This means that for rebuilds, or builds that use the same
dependencies, we don't pay the cost of linking files into the sandbox. With the
hack applied, the trivial re-build feels near-instant.
I think a lot of work would need to be done to make the approach
production-ready.
My hope is that this message will start a conversation to explore the
possibilities for speeding up staging :)
Cheers,
Angelos
P.S.
In trying to address the same problem in Bazel, some engineers have created
'sandboxfs' for avoiding the burden of linking into the sandbox. Instead it
creates a FUSE fs that maps as desired.
https://blog.bazel.build/2017/08/25/introducing-sandboxfs.html
Separately I'm going to continue the work of measuring BuildStream performance
in general, as discussed on the 'Benchmarking BuildStream commands from user
point of view' thread.
P.P.S.
Here's my full repro steps and local patch, for reference.
Steps to setup the test:
```
wget https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-minirootfs-3.6.1-x86_64.tar.gz
mkdir -p sysroot
tar -x -f alpine-minirootfs-3.6.1-x86_64.tar.gz -C sysroot --exclude=./dev
tar -z -c -f alpine-3.6.1-x86_64.tar.gz -C sysroot .
rm -rf sysroot
mkdir manyfiles
cd manyfiles
mkdir -p {1..100}/{1..100}
for name in {1..100}/{1..100}/{1..10}; do echo $name > $name; done
cd ..
tar czf manyfiles.tar.gz manyfiles/
rm -rf manyfiles/
mkdir bst-alpine
cat > bst-alpine/alpine.bst <<EOF
kind: import
description: Import Alpine Linux.
sources:
- kind: tar
url: file://$(pwd)/alpine-3.6.1-x86_64.tar.gz
EOF
cat > bst-alpine/manyfiles.bst <<EOF
kind: import
description: Import many files.
sources:
- kind: tar
url: file://$(pwd)/manyfiles.tar.gz
EOF
cat > bst-alpine/test.bst <<EOF
kind: manual
description: Trivially exercise a manual step.
depends:
- filename: manyfiles.bst
type: build
- filename: alpine.bst
type: build
config:
install-commands:
- touch /buildstream/install/1
EOF
cd bst-alpine
echo 'name: bst-alpine' > project.conf
bst track alpine.bst
bst track manyfiles.bst
```
```diff
diff --git a/buildstream/element.py b/buildstream/element.py
index b76ef68..567e1e8 100644
--- a/buildstream/element.py
+++ b/buildstream/element.py
@@ -1057,6 +1057,21 @@ class Element(Plugin):
def cleanup_rootdir():
utils._force_rmtree(rootdir)
+ cache_key = utils._generate_key({
+ 'dependencies': [
+ e._get_cache_key_from_artifact()
+ for e in self.dependencies(Scope.BUILD)
+ ],
+ })
+
+ cache_path = os.path.join(context.builddir,
'staging-cache-' + cache_key)
+ using_cache = False
+ if os.path.exists(cache_path):
+ using_cache = True
+ else:
+ os.makedirs(cache_path, exist_ok=True)
+ rootdir = cache_path
+
with _signals.terminator(cleanup_rootdir), \
self.__sandbox(rootdir, output_file, output_file) as
sandbox: # nopep8
@@ -1070,8 +1085,14 @@ class Element(Plugin):
try:
# Step 1 - Configure
self.configure_sandbox(sandbox)
- # Step 2 - Stage
- self.stage(sandbox)
+
+ if not using_cache:
+ # Step 2 - Stage
+ self.debug('Using root: {}'.format(rootdir))
+ self.stage(sandbox)
+ else:
+ self.debug('Using cache: {}'.format(cache_path))
+
# Step 3 - Assemble
collect = self.assemble(sandbox)
except BstError as e:
@@ -1132,7 +1153,11 @@ class Element(Plugin):
self.__artifacts.commit(self, assembledir)
# Finally cleanup the build dir
- cleanup_rootdir()
+ # cleanup_rootdir()
+
+ utils._force_rmtree(os.path.join(rootdir, 'artifact'))
+ utils._force_rmtree(os.path.join(rootdir, 'scratch'))
+ utils._force_rmtree(os.path.join(rootdir, 'root', 'buildstream'))
# _pull():
#
```
Build without staging cache patch applied (27 secs):
```
time bst --debug build test.bst
DEBUG: Early enablement of messages
Loading: 003
[pid:1377 id:001][--:--:--][ ][ main:test.bst
] DEBUG Created: manual element at test.bst [line 1 column 0]
[pid:1377 id:002][--:--:--][ ][ main:alpine.bst
] DEBUG Created: import element at alpine.bst [line 1 column 0]
[pid:1377 id:003][--:--:--][ ][ main:alpine.bst-0
] DEBUG Created: tar source at alpine.bst [line 4 column 2]
[pid:1377 id:004][--:--:--][ ][ main:manyfiles.bst
] DEBUG Created: import element at manyfiles.bst [line 1 column
0]
[pid:1377 id:005][--:--:--][ ][ main:manyfiles.bst-0
] DEBUG Created: tar source at manyfiles.bst [line 4 column 2]
Checking: 003/003
BuildStream Version 0.1.dev1451+gde35793.d20171110
Session Start: Wednesday, 15-11-2017 at 09:55:14
Project: alpine (/test/alpine-bst)
Targets: test.bst
User Configuration
Configuration File: Default Configuration
Log Files: /tmp/angelos/.cache/buildstream/logs
Source Mirrors: /tmp/angelos/.cache/buildstream/sources
Build Area: /tmp/angelos/.cache/buildstream/build
Artifact Cache: /tmp/angelos/.cache/buildstream/artifacts
Maximum Fetch Tasks: 10
Maximum Build Tasks: 4
Maximum Push Tasks: 4
Maximum Network Retries: 2
Pipeline
cached d92feae0 alpine.bst
cached f255b0f8 manyfiles.bst
buildable c8fc78c0 test.bst
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[pid:1377 id:000][--:--:--] START Starting build
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
alpine/test/c8fc78c0-build.1384.log
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
Staging dependencies
[pid:1384 id:001][00:00:22][c8fc78c0][build:test.bst ] SUCCESS
Staging dependencies
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
Integrating sandbox
[pid:1384 id:001][00:00:00][c8fc78c0][build:test.bst ] SUCCESS
Integrating sandbox
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
Staging sources
[pid:1384 id:001][00:00:00][c8fc78c0][build:test.bst ] SUCCESS
Staging sources
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
Running install-commands
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] STATUS
Running install-commands
touch /buildstream/install/13
[pid:1384 id:001][00:00:00][c8fc78c0][build:test.bst ] SUCCESS
Running install-commands
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
Running strip-commands
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] STATUS
Running strip-commands
find "/buildstream/install" -type f \
'(' -perm -111 -o -name '*.so*' \
-o -name '*.cmxs' -o -name '*.node' ')' \
-exec sh -ec \
'read -n4 hdr <"$1" # check for elf header
if [ "$hdr" != "$(printf \\x7fELF)" ]; then
exit 0
fi
debugfile="/buildstream/install/usr/lib/debug/$(basename "$1")"
mkdir -p "$(dirname "$debugfile")"
objcopy --only-keep-debug --compress-debug-sections "$1" "$debugfile"
chmod 644 "$debugfile"
strip --remove-section=.comment --remove-section=.note
--strip-unneeded "$1"
objcopy --add-gnu-debuglink "$debugfile" "$1"' - {} ';'
[pid:1384 id:001][00:00:00][c8fc78c0][build:test.bst ] SUCCESS
Running strip-commands
[pid:1384 id:001][--:--:--][c8fc78c0][build:test.bst ] START
Caching Artifact
[pid:1384 id:001][00:00:00][c8fc78c0][build:test.bst ] SUCCESS
Caching Artifact
[pid:1384 id:001][00:00:26][c8fc78c0][build:test.bst ] SUCCESS
alpine/test/c8fc78c0-build.1384.log
[pid:1377 id:000][00:00:26] SUCCESS Build Complete
Pipeline Summary
Total: 3
Session: 1
Fetch Queue: processed 0, skipped 1, failed 0
Build Queue: processed 1, skipped 0, failed 0
DEBUG: Destroyed: manual element at test.bst [line 1 column 0]
DEBUG: Destroyed: import element at alpine.bst [line 1 column 0]
DEBUG: Destroyed: import element at manyfiles.bst [line 1 column 0]
DEBUG: Destroyed: tar source at alpine.bst [line 4 column 2]
DEBUG: Destroyed: tar source at manyfiles.bst [line 4 column 2]
real 0m27.307s
user 0m13.430s
sys 0m12.360s
```
Priming staging cache with initial build of test.bst (25 secs):
```
time bst --debug build test.bst
DEBUG: Early enablement of messages
Loading: 003
[pid:1396 id:001][--:--:--][ ][ main:test.bst
] DEBUG Created: manual element at test.bst [line 1 column 0]
[pid:1396 id:002][--:--:--][ ][ main:alpine.bst
] DEBUG Created: import element at alpine.bst [line 1 column 0]
[pid:1396 id:003][--:--:--][ ][ main:alpine.bst-0
] DEBUG Created: tar source at alpine.bst [line 4 column 2]
[pid:1396 id:004][--:--:--][ ][ main:manyfiles.bst
] DEBUG Created: import element at manyfiles.bst [line 1 column
0]
[pid:1396 id:005][--:--:--][ ][ main:manyfiles.bst-0
] DEBUG Created: tar source at manyfiles.bst [line 4 column 2]
Checking: 003/003
BuildStream Version 0.1.dev1451+gde35793.d20171110
Session Start: Wednesday, 15-11-2017 at 10:09:06
Project: alpine (/test/alpine-bst)
Targets: test.bst
User Configuration
Configuration File: Default Configuration
Log Files: /tmp/angelos/.cache/buildstream/logs
Source Mirrors: /tmp/angelos/.cache/buildstream/sources
Build Area: /tmp/angelos/.cache/buildstream/build
Artifact Cache: /tmp/angelos/.cache/buildstream/artifacts
Maximum Fetch Tasks: 10
Maximum Build Tasks: 4
Maximum Push Tasks: 4
Maximum Network Retries: 2
Pipeline
cached d92feae0 alpine.bst
cached f255b0f8 manyfiles.bst
buildable 0eb54564 test.bst
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[pid:1396 id:000][--:--:--] START Starting build
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
alpine/test/0eb54564-build.1403.log
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] DEBUG
Using root:
/tmp/angelos/.cache/buildstream/build/staging-cache-d13087cbaf56233b5bbdd49d070455c917042aa9c965d55ccdbedbd20ef4d2aa
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
Staging dependencies
[pid:1403 id:001][00:00:22][0eb54564][build:test.bst ] SUCCESS
Staging dependencies
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
Integrating sandbox
[pid:1403 id:001][00:00:00][0eb54564][build:test.bst ] SUCCESS
Integrating sandbox
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
Staging sources
[pid:1403 id:001][00:00:00][0eb54564][build:test.bst ] SUCCESS
Staging sources
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
Running install-commands
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] STATUS
Running install-commands
touch /buildstream/install/14
[pid:1403 id:001][00:00:00][0eb54564][build:test.bst ] SUCCESS
Running install-commands
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
Running strip-commands
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] STATUS
Running strip-commands
find "/buildstream/install" -type f \
'(' -perm -111 -o -name '*.so*' \
-o -name '*.cmxs' -o -name '*.node' ')' \
-exec sh -ec \
'read -n4 hdr <"$1" # check for elf header
if [ "$hdr" != "$(printf \\x7fELF)" ]; then
exit 0
fi
debugfile="/buildstream/install/usr/lib/debug/$(basename "$1")"
mkdir -p "$(dirname "$debugfile")"
objcopy --only-keep-debug --compress-debug-sections "$1" "$debugfile"
chmod 644 "$debugfile"
strip --remove-section=.comment --remove-section=.note
--strip-unneeded "$1"
objcopy --add-gnu-debuglink "$debugfile" "$1"' - {} ';'
[pid:1403 id:001][00:00:00][0eb54564][build:test.bst ] SUCCESS
Running strip-commands
[pid:1403 id:001][--:--:--][0eb54564][build:test.bst ] START
Caching Artifact
[pid:1403 id:001][00:00:00][0eb54564][build:test.bst ] SUCCESS
Caching Artifact
[pid:1403 id:001][00:00:23][0eb54564][build:test.bst ] SUCCESS
alpine/test/0eb54564-build.1403.log
[pid:1396 id:000][00:00:23] SUCCESS Build Complete
Pipeline Summary
Total: 3
Session: 1
Fetch Queue: processed 0, skipped 1, failed 0
Build Queue: processed 1, skipped 0, failed 0
DEBUG: Destroyed: manual element at test.bst [line 1 column 0]
DEBUG: Destroyed: import element at alpine.bst [line 1 column 0]
DEBUG: Destroyed: import element at manyfiles.bst [line 1 column 0]
DEBUG: Destroyed: tar source at alpine.bst [line 4 column 2]
DEBUG: Destroyed: tar source at manyfiles.bst [line 4 column 2]
real 0m24.859s
user 0m13.130s
sys 0m10.440s
```
Change test.bst to touch file '2' instead.
Now it uses the staging cache (1.5 secs):
```
time bst --debug build test.bst
DEBUG: Early enablement of messages
Loading: 003
[pid:1413 id:001][--:--:--][ ][ main:test.bst
] DEBUG Created: manual element at test.bst [line 1 column 0]
[pid:1413 id:002][--:--:--][ ][ main:alpine.bst
] DEBUG Created: import element at alpine.bst [line 1 column 0]
[pid:1413 id:003][--:--:--][ ][ main:alpine.bst-0
] DEBUG Created: tar source at alpine.bst [line 4 column 2]
[pid:1413 id:004][--:--:--][ ][ main:manyfiles.bst
] DEBUG Created: import element at manyfiles.bst [line 1 column
0]
[pid:1413 id:005][--:--:--][ ][ main:manyfiles.bst-0
] DEBUG Created: tar source at manyfiles.bst [line 4 column 2]
Checking: 003/003
BuildStream Version 0.1.dev1451+gde35793.d20171110
Session Start: Wednesday, 15-11-2017 at 10:10:59
Project: alpine (/test/alpine-bst)
Targets: test.bst
User Configuration
Configuration File: Default Configuration
Log Files: /tmp/angelos/.cache/buildstream/logs
Source Mirrors: /tmp/angelos/.cache/buildstream/sources
Build Area: /tmp/angelos/.cache/buildstream/build
Artifact Cache: /tmp/angelos/.cache/buildstream/artifacts
Maximum Fetch Tasks: 10
Maximum Build Tasks: 4
Maximum Push Tasks: 4
Maximum Network Retries: 2
Pipeline
cached d92feae0 alpine.bst
cached f255b0f8 manyfiles.bst
buildable 34f5b215 test.bst
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[pid:1413 id:000][--:--:--] START Starting build
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] START
alpine/test/34f5b215-build.1420.log
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] DEBUG
Using cache:
/tmp/angelos/.cache/buildstream/build/staging-cache-d13087cbaf56233b5bbdd49d070455c917042aa9c965d55ccdbedbd20ef4d2aa
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] START
Running install-commands
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] STATUS
Running install-commands
touch /buildstream/install/15
[pid:1420 id:001][00:00:00][34f5b215][build:test.bst ] SUCCESS
Running install-commands
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] START
Running strip-commands
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] STATUS
Running strip-commands
find "/buildstream/install" -type f \
'(' -perm -111 -o -name '*.so*' \
-o -name '*.cmxs' -o -name '*.node' ')' \
-exec sh -ec \
'read -n4 hdr <"$1" # check for elf header
if [ "$hdr" != "$(printf \\x7fELF)" ]; then
exit 0
fi
debugfile="/buildstream/install/usr/lib/debug/$(basename "$1")"
mkdir -p "$(dirname "$debugfile")"
objcopy --only-keep-debug --compress-debug-sections "$1" "$debugfile"
chmod 644 "$debugfile"
strip --remove-section=.comment --remove-section=.note
--strip-unneeded "$1"
objcopy --add-gnu-debuglink "$debugfile" "$1"' - {} ';'
[pid:1420 id:001][00:00:00][34f5b215][build:test.bst ] SUCCESS
Running strip-commands
[pid:1420 id:001][--:--:--][34f5b215][build:test.bst ] START
Caching Artifact
[pid:1420 id:001][00:00:00][34f5b215][build:test.bst ] SUCCESS
Caching Artifact
[pid:1420 id:001][00:00:00][34f5b215][build:test.bst ] SUCCESS
alpine/test/34f5b215-build.1420.log
[pid:1413 id:000][00:00:00] SUCCESS Build Complete
Pipeline Summary
Total: 3
Session: 1
Fetch Queue: processed 0, skipped 1, failed 0
Build Queue: processed 1, skipped 0, failed 0
DEBUG: Destroyed: manual element at test.bst [line 1 column 0]
DEBUG: Destroyed: import element at alpine.bst [line 1 column 0]
DEBUG: Destroyed: import element at manyfiles.bst [line 1 column 0]
DEBUG: Destroyed: tar source at alpine.bst [line 4 column 2]
DEBUG: Destroyed: tar source at manyfiles.bst [line 4 column 2]
real 0m1.451s
user 0m0.640s
sys 0m0.100s
```
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]