Re: Redistributing refs from multiple origins in a single repository

From: Philip Withnall <philip tecnocode co uk>
To: Alexander Larsson <alexl redhat com>, ostree-list gnome org
Subject: Re: Redistributing refs from multiple origins in a single repository
Date: Tue, 30 May 2017 15:31:45 +0100

On Tue, 2017-05-30 at 10:13 +0200, Alexander Larsson wrote:

On Sat, 2017-05-27 at 23:53 +0100, Philip Withnall wrote:

Hi all,

Here’s a bit of a writeup about something which I’ve been discussing
with Colin and Alex recently. It’s primarily of interest to them, but
it affects the core of OSTree and how flatpak uses OSTree, so feedback
from anyone else (especially other users of OSTree) is very welcome.


Overall I agree with this approach. Some details below:


Thanks for the feedback!

Various people have suggested a different approach which disambiguates
ref names based on a second token (an ‘origin ID’, which has previously
been called an ‘originish’, but that’s not very obvious terminology),
so the combination of (origin ID, ref name) is globally unique.


I liked the 'originish' name, but I realize it doesn't sound very
professional. :)


I’m personally fine with ‘originish’ (it tickles me), but I think it’s
a bit opaque to newcomers. ‘Origin ID’ is, unfortunately, a little more
immediately understandable.

Open questions
===

For convenience, once you’ve read the sections below:

 - Is detached metadata signed? If so, would it be a better place to
put a ref list than the commit metadata? §(Unsigned summaries)


Detached metadata is never signed. In fact, it is typically used to
sign unsigned commits after the fact, which makes it impossible for it
to be signed. However, it could contain inline signed data. For
instance, one could add a new metadata field with a GPG signature of a
(commit-id, new-ref) tuple.


That would work, although in this case it would be a new a(ss) field
which contains (origin ID, ref name) tuples.

 - Are static deltas signed (like commits)? §(Unsigned summaries)


No, static deltas are not signed. However, the commit objects in them
are. So, if we trust the delta apply mechanism to be safe (in that it
can only create new, correct objects), then as long as the commit object
signature verifies in the end, the delta should be fine.


So you’re saying that every bit of
OSTREE_STATIC_DELTA_SUPERBLOCK_FORMAT is represented in the
reconstructed ‘to’ commit?

So, I don't really think we need to sign these.

 - What naming scheme do we want to use for origin IDs? §(Origin naming
scheme)


I'm partial to the reverse-DNS style, for consistency if nothing else.
That tends to create rather long names though, and the remote name is
typed in a lot of commands, so maybe not ideal.


Tab-completion may help here?

Regardless of the format of origin IDs, the .flatpakrepo and
.flatpakref formats should acquire a new key to specify the
repository’s origin ID. This would make the remote name argument to
`flatpak remote-add` optional.


This last sentence is a bit unclear. We could already generate a remote
name if we wanted, for instance from the basename of the flatpakrepo
file. However, the reason we don't is that the remote name is a point
of trust in the system, and some shady flatpakrepo file could claim to
be the remote called "official-firefox" or something. To avoid this, the
user is always in control of the remote name.


I’m not sure how the remote name is supposed to function as a point of
trust in the system. Am I right in thinking you’re trying to prevent
the situation where the user downloads a .flatpakref file for a new
game (for example) from a third-party website; but it secretly
configures an ‘official-firefox’ repository and starts listing fake
Firefox versions in gnome-software?

To mitigate that risk you don’t need to make the user type out the
remote name; they just need to be asked to validate it. And then
gnome-software needs to make sure to make the remote obvious when
installing software.

Of course, in the P2P case we generally *do* want the remote to have a
specific name, so maybe we want to avoid the user typing the wrong
thing there.


If the user validates a remote name which is given in the .flatpak*
file (either as its basename, or as a key in the file), that would
avoid input errors.
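
As a rough sketch of that validation flow (not flatpak's actual code): the snippet below reads a suggested remote name from a .flatpakrepo-style key file and only accepts it if the user confirms it. The `OriginID` key name, the example values and the `confirm` callback are all assumptions made for illustration; only the `[Flatpak Repo]` group is taken from the existing format.

```python
import configparser

# Example file contents; the OriginID key is hypothetical.
EXAMPLE_FLATPAKREPO = """\
[Flatpak Repo]
Title=Example Repository
Url=https://example.com/repo/
OriginID=com.example.Repo
"""


def suggested_remote_name(flatpakrepo_text, basename):
    """Return the remote name to propose to the user."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read_string(flatpakrepo_text)
    group = parser["Flatpak Repo"]
    # Prefer the (hypothetical) OriginID key, else fall back to the basename.
    return group.get("OriginID", basename)


def add_remote(flatpakrepo_text, basename, confirm):
    """Return the remote name only if the user validates it."""
    name = suggested_remote_name(flatpakrepo_text, basename)
    if not confirm(name):
        raise PermissionError("user rejected remote name %r" % name)
    return name
```

The user never types the name, so there are no input errors, but a file claiming to be "official-firefox" still has to get past an explicit confirmation.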

In addition, an origin ID needs to be included in the commit metadata,
paired with each ref name; otherwise an attacker could make a commit
from one origin (in a P2P server) be pointed at by an identically named
ref in another origin. This situation is not as rare as one might
think: it could easily apply to the `appstream/$arch` branches which
flatpak uses.


Well, the uncommon part would be the ref name *and* the gpg key to
match, no?


True, though it’s still something I would like to avoid by putting the
origin ID in there.
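
A minimal sketch of the check this implies, assuming the signed commit metadata carries a list of (origin ID, ref name) pairs like the a(ss) field mentioned earlier. The function name and the example origin IDs are hypothetical, and signature verification of the commit itself is assumed to have happened already.

```python
def commit_is_bound_to(commit_bindings, origin_id, ref_name):
    """Check that a commit claims to belong to (origin_id, ref_name).

    commit_bindings is the list of (origin ID, ref name) pairs read from
    the commit's signed metadata.
    """
    return (origin_id, ref_name) in {tuple(pair) for pair in commit_bindings}


# A commit published by one origin for its appstream branch.
bindings = [("com.example.OriginA", "appstream/x86_64")]
```

With this, a P2P server offering that commit under `("com.example.OriginB", "appstream/x86_64")` is rejected even though the ref name is identical.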

The additional metadata in the summary file should be signed as needed,
using inline signatures. For example, this would include the
repository’s origin ID. P2P redistribution of this signed metadata
would require copying it and its signature without modification. We
would need a definition of which metadata keys need to be signed, and
how they are merged from multiple origins when doing P2P
redistribution.
 - ostree.summary.last-modified: This can be regenerated by whoever
generates the summary file and doesn’t need to be signed (signing it
doesn’t meaningfully prevent any attacks).


It depends on how this is used. Timestamps are generally used as a way
to avoid MiTM attacks that downgrade you, or keep you from updating. If
signed, one can trust that all the other signed metadata in the file
is as new as this, and thus clients will never go back to earlier
metadata. It is, however, not useful for non-signed data.
So, while I agree that the overall timestamp need not be signed, we do
want timestamps for freshness on things like xa.title,
xa.default-branch, xa.redirect-url and xa.gpg-keys.


True.
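
To illustrate the freshness rule (a sketch only, with hypothetical names): the client keeps the timestamp of the last signed metadata it accepted per key, and refuses anything older. The snippet assumes the timestamp and value were covered by a signature that has already been verified.

```python
def update_signed_metadata(cache, key, timestamp, value):
    """Store (timestamp, value) for key unless it is older than the cache.

    Returns True if the cache was updated, False if the offered metadata
    was rejected as a downgrade.
    """
    cached = cache.get(key)
    if cached is not None and timestamp < cached[0]:
        return False  # older than what we already trust: ignore it
    cache[key] = (timestamp, value)
    return True


cache = {}
update_signed_metadata(cache, "xa.default-branch", 1496150000, "stable")
```

A later attempt to replay an older signed value, such as `update_signed_metadata(cache, "xa.default-branch", 1490000000, "beta")`, returns False and leaves the cache untouched.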

 - ostree.static-deltas: Static deltas don’t appear to be
individually


As per above, I don't think we need to sign deltas.


Discussed above.

 - xa.cache: Would definitely need to be merged into a map of origin ID
to cache data (i.e. a map of type {s{s(tts)}}). The main `xa.cache` key
could refer to the refs for the main origin in the summary file. This
must be signed inline (one signature per origin entry).


This currently contains, for each ref: installed size, downloaded size,
and the metadata file.

The sizes are honestly mostly hints to the end user, and if a server
could fake these then it's not truly a problem. However, the metadata
file is security sensitive, as it contains the permissions that the
app requests, as well as dependency information (i.e. what runtime the
app needs). If you make security sensitive decisions based on this,
then it is problematic if it is fakeable. The plan I had here was to
treat the data in the summary as a cache, and since we (recently)
added the metadata also to the commit object (which is signed), we can
verify that the version in the summary actually matches the one we
later pulled, and if they don't, error out. That way we don't really
need to sign the cache. Really what we do is handle the metadata the
same way as we do the ref: Trust for download, but verify before use.
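
A minimal sketch of "trust for download, but verify before use", assuming the metadata is available as a string both from the unsigned summary cache and from the signed commit. The function and exception names are hypothetical; this is not flatpak's implementation.

```python
class MetadataMismatch(Exception):
    """The summary's cached metadata differs from the signed commit's."""


def verify_cached_metadata(summary_metadata, commit_metadata):
    """Return the trusted metadata, or raise if the cache was wrong.

    summary_metadata comes from the unsigned summary and was only used to
    decide what to download; commit_metadata comes from the commit object,
    whose signature is assumed to have been verified already.
    """
    if summary_metadata != commit_metadata:
        raise MetadataMismatch("summary metadata does not match the commit")
    return commit_metadata
```

The design choice is that the cache never needs its own signature: a server can lie in the summary, but the lie is caught before the app's permissions are acted upon.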

In fact, what I would really want is to not have this separate xa.cache
field, but instead use the regular per-ref metadata field in the
summary. In flatpak I couldn't do this, because there is no API to add
things to the per-ref metadata when constructing the summary file.
However, what if ostree automatically took certain fields from the
commit object and put them in the per-ref metadata in the summary, and
then automatically verified that these are identical when pulling the
commit? That seems very generic, and would solve things nicely for
flatpak.


Sounds reasonable to me. I think this is more Colin’s domain though.

The advantages of an unsigned summary file are good:
 - No race between updating summary and summary.sig when publishing on
a server (https://github.com/ostreedev/ostree/issues/487)
 - No need to have the signing key available and used frequently to
regenerate the summary file on a busy server like flathub
 - P2P support


One disadvantage is for caching. We're currently polling the summary
often, because downloading the small summary.sig file is quick, and
lets us ensure that the local summary cache is fresh. If we don't do
this we need to replace this with proper ETag handling to avoid
constantly re-downloading the entire summary file.


ETag handling is precisely the tool for this job, and shouldn’t be hard
to add — should just be an `If-None-Match` header from the client and
an `ETag` header from the server both using hashes of the summary file.
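
Roughly, the server side of that exchange could look like the sketch below. It is an illustration rather than OSTree code: the function names are made up, SHA-256 is just one possible hash, and only a single exact-match `If-None-Match` value is handled.

```python
import hashlib


def summary_etag(summary):
    """Derive a strong ETag (a quoted string) from the summary bytes."""
    return '"' + hashlib.sha256(summary).hexdigest() + '"'


def serve_summary(summary, if_none_match=None):
    """Return (status, headers, body) for a GET of the summary file."""
    etag = summary_etag(summary)
    if if_none_match == etag:
        # The client's cached copy is current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, summary
```

A client would store the `ETag` from the first 200 response and send it back as `If-None-Match` when polling; it then only downloads the full summary when the content has actually changed.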

Pulling (origin ID, ref name) from a P2P server using commits
---

 1. If the user has not pulled a ref from this origin before, they must
configure a new remote with the appropriate GPG keyring and a name
matching the origin ID. The remote configuration does not have to
include an upstream URI for the origin, but that would allow pulls from
the origin in future (and OSTree currently requires a URI to be
specified).


Note: Flatpak currently uses URI set to empty to mean the remote is
disabled.


We can keep those semantics, then, and require the new remote to have a
URI set if it should be enabled.
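
For illustration, here is a small sketch of those semantics applied to a repository config in the `[remote "name"]` key-file layout. The remote names and URL are invented for the example, and treating an empty `url` as "disabled" is the convention described above rather than something this snippet checks against flatpak itself.

```python
import configparser

# Example config; names and URL are made up. The second remote is named
# after its origin ID but has no upstream URI, so it counts as disabled.
EXAMPLE_CONFIG = """\
[remote "com.example.Repo"]
url=https://example.com/repo/
gpg-verify=true

[remote "org.example.P2POnly"]
url=
gpg-verify=true
"""


def enabled_remotes(config_text):
    """Return the names of remotes that have a non-empty URI."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    prefix = 'remote "'
    names = []
    for section in parser.sections():
        if section.startswith(prefix) and section.endswith('"'):
            if parser[section].get("url", ""):
                names.append(section[len(prefix):-1])
    return names
```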

Philip

