[BuildStream] Protect against plugin modifications of artifacts



Hi all,

TLDR;

   This is a motion to destroy any and all possible public APIs from
   the virtual directory API[0] and any related APIs, which mistakenly
   give plugin authors the ability to mutate the content and attributes
   of files within the sandbox or artifacts, or to import files from
   outside the sandbox which are not otherwise imported by Sources.

   This is not a proposal for anything new, this is a course correction
   to ensure that filesystem permutations can only ever happen within
   the sandbox, as they were only ever allowed to happen in the
   sandbox.

In this email I will try to piece together a story of how things went
awry in this regard.


How we got tangled into this mess
=================================
In a nutshell, this all starts with the collect_manifest plugin from
the freedesktop-sdk project, which exploits a weakness in our older
API, which exposes a directory on disk for the purpose of allowing
plugin authors to stage files from artifacts into directories (it's
important that plugins are allowed to put files wherever they want
inside a sandbox when implementing Element.stage()), this weakness in
our API was exploited to have the plugin code actually *create* files
in the sandbox which would subsequently be wrapped up into an artifact.

Consequently, this breaks the promise BuildStream makes that all
filesystem data permutations are performed inside an isolated and
controlled sandbox environment.

  * The driver for this popped up in early 2018 when this BuildStream
    issue[1] and this freedesktop-sdk issue[2] were opened.

    These issues outline a need from the freedesktop-sdk to create
    a "manifest", which is to say that they essentially want something
    descriptive in the output, which describes everything which went
    into the input, to create the builds.

  * This was later discussed on the mailing list[3], at which point
    my position was that we should be able to extract this information
    using scripting with the `bst` command line.

    However, I pointed out in this mail that we do not currently have
    any way for Sources to report standardized information about the
    source URL or what version it was.

    As the 'url' and 'ref' fields are entirely in the Source domain,
    the core does not know what attribute name is used for a `url`
    or `ref` (in the YAML), nor does it know anything about the content
    structure of a `ref`.

    I also pointed out that this was discussed before, it has been a
    long standing issue that we would like to have something like
    `bst source show` or some standard way to extract this information.

    I should point out now that my expectation at that point, was of
    course for the freedesktop-sdk team to get involved in developping
    this standardized way of extracting original source information,
    this would be the natural next step given that it is a long
    standing issue we've already agreed needs to be fixed, and that
    fixing it would solve the problem which the freedesktop-sdk team
    was faced with (extracting and generating manifests).

    Instead, the following things happened.

  * Later in 2018, this freedesktop-sdk issue[4] was opened, followed
    by this merge request[5].

    Thus was born the hack we know today as the freedesktop-sdk
    `collect_manifest` plugin, which does the following things:

      * The plugin makes assumptions that the plugins in dependency
        elements are what they expect by accessing "kind" and then
        accessing private API which cannot be guaranteed to work.

        This is still a problem today, and is one of the first issues
        of the bst-plugins-experimental project[6]

      * Exploits the availability of the `directory` argument to
        Element.stage() and such APIs, to break the contract that
        filesystem data will only ever be modified inside the sandbox
        environment (which is under discussion today)

  * Around the time this collect_manifest hack was created, I recall
    being swamped with bst 1.2 related details, so I did not have much
    time slices to allocate to freedesktop-sdk.

    However, I do clearly recall being asked if this exploitation of
    the Sandbox was correct, I'm pretty sure that Javier had concerns
    that this was breaking the sandbox contract and reached out to me.

    I do recall making it clear that this was wrong at the time, but
    I guess I did not do anything further like commenting on issues
    or mailing lists for posterity (which would be useful now).

    I just think it's pretty obvious: modifying the sandbox content
    using python plugin code has forever been wrong.

  * Since this time, things have apparently gotten worse and worse.

    * There was yet another plugin which followed the collect_manifest
      exploitation as if it were a pattern supported by BuildStream,
      and thus was born the `oci` plugin which equally exploits the
      sandbox.

    * With the development of REAPI, we introduced the virtual
      directory APIs (which have recently become mandatory).

      This was the *perfect* opportunity to banish these exploitations,
      and ensure that plugins *cannot* exploit the API to circumvent
      sandboxing protections, and modify files in the sandbox.

      Unfortunately, we made the situation worse by *supporting*
      this exploitation in first class APIs in the virtual directory
      APIs.

      In other words, we went out of our way to continue supporting
      a construct which has always been an illegal hack.


What to do now for BuildStream 2.0
==================================
This is my recommended course of action:

  * Banish any new APIs we've created which treat sandbox abuse as
    a supported feature.

    I.e. remove any public API surfaces which allow this to happen
    before it's too late.

    This should be done ASAP and should be a blocker for 2.0.

  * Do not block 2.0 on any alternative support for collect_manifest or
    oci plugins.

    At least for the collect_manifest plugin, I feel confident that
    this was a quick workaround to avoid getting involved and fixing
    the problem by properly adding support for what then needed in
    upstream BuildStream proper.

    The people who originally wrote these plugins knew full well that
    it was a hacky exploitation of the API.

    There can be absolutely no reasonable expectation that BuildStream
    continue to support this exploitation.


What to do for manifests and oci plugins
========================================
Go back to the drawing board.

For collect manifest, I strongly suggest revisiting the mailing list
discussions[3] from 2018 and instead of working around the issue with a
quick hack, start getting to work on the root cause of this issue which
is that BuildStream cannot export originating source URLs in a standard
format (in any case, collect_manifest.py has no realistic future and
needs to be rewritten due to the aforementioned issue[6])

For the `oci` plugin, I am not sure what details caused it to become
dependent on this exploitation of the sandbox, but certainly there are
other ways to do this without having plugin python manipulating file
content in the sandbox.

  * The plugin could export public data to the sandbox via environment
    variables, allowing the sandbox to execute scripts which generate
    whatever is needed.

  * BuildStream could be given a feature to *stage* public data in a
    sandbox relative location for elements to process.

    While this would be borderline similar to the existing exploits,
    at least it would be maintained in the core once, where we could
    take special care to ensure that staged public data is always
    sorted stably and content is never changed in the future.

But in short, the authors of these plugins should be doing the work to
get their features supported properly upstream, and BuildStream 2.0
should not be held hostage to provide alternative support for a feature
which was never a feature in the first place.


So, what are your thoughts ?

Best regards,
    -Tristan


[0]: https://gitlab.com/BuildStream/buildstream/-/blob/master/src/buildstream/storage/directory.py
[1]: https://gitlab.com/BuildStream/buildstream/-/issues/235
[2]: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/issues/55
[3]: https://mail.gnome.org/archives/buildstream-list/2018-August/msg00013.html
[4]: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/issues/376
[5]: https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/merge_requests/735
[6]: https://gitlab.com/BuildStream/bst-plugins-experimental/-/issues/2






[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]