Re: [BuildStream] cache key changes



Hi Darius,

TL;DR: I think we should sort() the `dependencies` (and their new counterparts).

Longer:

We talked about this briefly in the same discussion, but I would like to
raise an issue publicly about all `LIST` types in the cache key
dictionary. I think it applies to both the current version and this
proposed version. Let's take the old one. The list types are:
- sources = same order as in the bst file
- fatal-warnings = sorted
- dependencies = order determined by `Element.dependencies()`

While the ordering of sources and fatal-warnings makes sense, the last one
doesn't make much sense to me. `Element.dependencies()` is at the moment the
all-in-one function that does the graph traversal and is used in many
other places, including `_calculate_cache_key`. It was recently
optimized, and according to its docs (from 2 years ago):

```
If `recurse` is specified (the default), the full dependencies will be listed
in deterministic staging order, starting with the basemost elements in the
given `scope`.
```

I think this traversal order is an unnecessary, extra, and perhaps
harmful detail when calculating the cache key. Ideally, I think it
should be a 'set', but I don't know a way of encoding one other than
using a sorted list. This can have two gains:
- If we change the traversal order (for perf reasons, scheduler-related
things, etc.), cache keys won't be affected.
- It MIGHT pave the way for a more lightweight traversal method,
deterministic or not, used only for cache key calculation,
potentially speeding things up.
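
To illustrate the first point, here is a minimal sketch. It is not
BuildStream's actual code: `cache_key`, `key_for` and the example key
strings are made up for illustration, and I am assuming a JSON + SHA-256
encoding just to have something concrete. It shows that hashing a sorted
list of dependency keys gives the same result whatever order the
traversal returned them in:

```python
import hashlib
import json


def cache_key(cache_dict):
    # Hypothetical helper: serialize with sorted dictionary keys, then hash.
    # json.dumps preserves list order, so any list must be ordered by the caller.
    serialized = json.dumps(cache_dict, sort_keys=True).encode("utf-8")
    return hashlib.sha256(serialized).hexdigest()


def key_for(dependency_keys, sort_deps):
    # Either keep the traversal order as-is or normalize it by sorting.
    deps = sorted(dependency_keys) if sort_deps else list(dependency_keys)
    return cache_key({"dependencies": deps})


# The same set of dependencies, as two different traversals might yield them.
staging_order = ["aaa111", "ccc333", "bbb222"]
other_order = ["bbb222", "aaa111", "ccc333"]

# Without sorting, the traversal order leaks into the cache key.
unsorted_differs = key_for(staging_order, False) != key_for(other_order, False)
# With sorting, the key depends only on the set of dependencies.
sorted_matches = key_for(staging_order, True) == key_for(other_order, True)
```

A sorted list is effectively a canonical encoding of the set, which is
all the cache key needs here.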

Best,
Gokcen

Darius Makovsky via buildstream-list <buildstream-list gnome org> wrote on
Fri, 5 Jul 2019 at 15:00:

During Thursday's discussion about cache key generation it was agreed
that there should be a revised public specification for the cache key
dictionary. The current cache key dictionary is specified at
https://gitlab.com/BuildStream/buildstream/blob/master/src/buildstream/element.py#L2205-2222.
The current proposal is to alter and formalize that as the following:

```JSON
{
    core-version: <INT>,
    element-plugin-key: <*>,
    element-plugin-name: <STR>,
    element-plugin-version: <INT>,
    dependencies-strong: [<STR>],
    dependencies-weak: [<STR>],
    environment-variables: {[<STR>: <STR>]},
    sandbox: {[<STR>: <STR>], "tainted": <BOOL>},
    sources: [{"key": <*>, "name": <STR>, "version": <INT>}],
    public: {...},
    fatal-warnings: [<STR>]
}
```

This comprises the following changes:
* `artifact-version` is replaced by `core-version`
* `element` is replaced by:
    - `element-plugin-key`
* `element-plugin-name` is added
* `element-plugin-version` is added
* `dependencies` is replaced by:
    - `dependencies-strong` (containing the strong cache keys of
dependencies)
    - `dependencies-weak` (containing the weak keys of dependencies)
* `context` is removed
* `project` is removed
* `environment` is replaced by `environment-variables`
* `execution-environment` is replaced by `sandbox`
* `sources` is now a list of dictionaries of name, version, and key.
This may be overwritten by the values of `workspaces`, which is removed
* `fatal-warnings` is unchanged
* `public` is unchanged
* `cache` is removed

Some of these keys are renamed to better suggest the function which
generates their values.

It is proposed that default values are not inserted into this
dictionary, since they do not add value to the cache key generation. In
addition, it is proposed to insert `public-nocache` into the dictionary.
Values which are publicly defined but do not affect the build result
constitute the values of `public-nocache`.

There is an additional point to raise concerning the return of
`get_unique_key` in plugins: I think this should be standardized as some
sort of serializable type. A dictionary would probably be a good choice.
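
As a rough sketch of what a standardized return type could look like
(the `ExampleSource` class, its fields and `serialize_key` are
hypothetical and only for illustration, not an existing plugin or core
function), a plugin would return a plain dictionary that the core can
serialize deterministically:

```python
import json


class ExampleSource:
    # Hypothetical plugin, not a real BuildStream class.
    def __init__(self, url, ref):
        self.url = url
        self.ref = ref

    def get_unique_key(self):
        # Return only plain, serializable types (str, int, list, dict).
        return {"url": self.url, "ref": self.ref}


def serialize_key(unique_key):
    # json.dumps raises TypeError for values it cannot serialize, so a
    # non-conforming plugin would be caught here rather than silently hashed.
    return json.dumps(unique_key, sort_keys=True)


source = ExampleSource("https://example.com/repo.git", "abc123")
serialized = serialize_key(source.get_unique_key())
```

With `sort_keys=True` the serialized form does not depend on the order
in which the plugin happened to build the dictionary.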

--
Best Regards,
Darius



