Re: Adding Avahi and USB drive support to libostree



On Wed, 2017-03-29 at 10:04 -0400, Colin Walters wrote:
On Wed, Mar 29, 2017, at 05:16 AM, Philip Withnall wrote:

In the sense that each ref has a different GPG key; or in the sense that the ref name is a key ID? In either case, commits would need to support being signed by multiple keys (which I assume they do already by appending the signature packets in the .sig file?).

While nothing *requires* it, the naming convention established from the start of ostree for refs encourages being globally unique - this is *very* different from git, where almost everyone has a `master` branch. And the ref name is what's used for downloads/upgrades (by default). I chose this because:
 - The concept of "master" doesn't make as much sense since it's intended to have multiple equally-important branches (e.g. fedora/26/x86_64/atomic-host and fedora/26/x86_64/workstation)
 - And besides the "content" differentiation, there's CPU architecture in the branch name

That makes sense. Using the ref name as an identifier would be a possibility, but I don't think it would work with Avahi, since a peer would have to advertise every ref it has, which would result in a huge amount of mDNS traffic for the DNS-SD records for a flatpak repository, for example. :-(  That's N things to advertise per repository, instead of 1.
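As a rough back-of-the-envelope sketch of the scaling problem (the per-record byte counts here are assumptions for illustration, not measured values), advertising one DNS-SD service per ref multiplies the PTR/SRV/TXT overhead by the number of refs, whereas one service per repository keeps it constant:

```python
# Illustrative sketch: each advertised DNS-SD service instance costs
# roughly a PTR + SRV + TXT record, so advertising N refs as N services
# means N times that overhead per peer. Record sizes are assumptions.

def advertisement_bytes(num_records, avg_record_bytes=120):
    """Approximate mDNS bytes to advertise num_records service instances,
    counting PTR + SRV + TXT per instance (assumed average size)."""
    records_per_service = 3  # PTR, SRV, TXT (assumption)
    return num_records * records_per_service * avg_record_bytes

# A flatpak repository can easily carry hundreds of refs (apps, runtimes,
# locales, debug extensions):
per_ref = advertisement_bytes(500)  # one service per ref
per_repo = advertisement_bytes(1)   # one service per repository
print(per_ref, per_repo)
```

Even with generous assumptions, the per-ref scheme is linear in the number of refs, which is what makes the per-repository identifier attractive.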

Are you suggesting this because of the way flatpak uses OSTree --- one repository which deduplicates files from various separate applications, each of which has its own ref, and theoretically is its own security domain?

Right.  And the same for the host OS case - it should never happen that a flatpak ref is the same as any existing host ref, and all of those should be distinct from refs generated by e.g. the `atomic` command importing Docker/OCI images, etc.

+1

That's a valuable use case, but not the one I am trying to address here. The use case I am trying to address is for updating *without* a connection to the internet.

OK.  I read back and didn't see this mentioned from the start.

Argh, my bad, sorry. I shouldn't have jumped straight in with an
implementation suggestion.

Anyways, I'll be honest, I have very little practical experience with mesh/P2P/decentralized networking.

I don't think there's too much different here: we're still using HTTP,
and the GPG security model leans towards decentralisation anyway. The
main difference is the need to advertise and find remotes.

The "completely disconnected from the Internet" model also occurs in a primary use case for me, which is servers.  But there, what we've documented is changing the e.g. /etc/ostree/remotes.d/redhat.conf file. In the enterprise server case of course, there are sysadmins who are dedicated to managing configuration like this, *and* it's very typical to model things as an "intranet" - you still use DNS, TLS (but with custom CA certs) etc.  So pointing the remote at mirror.examplecorp.com works just fine and is expected.

That makes sense, but is unfortunately not applicable in the Endless OS
use case, where we want things to need zero configuration, and cannot
require custom CA certs, etc.

So let me ask this - in the mesh (is there a better/standard word?) case, what are you doing for *other* applications?  What does e.g. the web browser app do?  How do you deliver and cache non-OS content?

Generally, these kinds of systems which are one hop removed from the
internet have a load of pre-loaded content (for example, an offline
copy of Wikipedia), and can be served other content from the machine
which *is* connected to the internet via a local HTTP server (although
to be honest, I don't know the details of the distribution of non-OS
content). These users are not expecting to be able to load up Facebook!

What prior art is there in other mesh software for problem sets
like naming/identification?  I glanced at https://gnunet.org/about
and indeed they seem to have a DNS-like system that's derived
from a public/private key. 

I'm not sure we need the naming of things to be distributed, do we?
These use cases care about decentralised distribution of content which
comes from some centralised authority (an OS vendor with a signing
key), so the naming can be centralised.

That said, it would be nice to make the naming decentralised as well. I
think the two standard approaches to decentralised naming are:
hierarchical naming, where each peer is in control of some namespace
(as you say); and probe-based naming, where each peer proposes a name
and sees if anyone else has claimed it (as per mDNS A/AAAA records).
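The probe-based approach can be illustrated with a toy conflict-resolution loop (the retry-with-suffix scheme below is how mDNS resolves hostname conflicts for A/AAAA records; the claim set stands in for "what we heard on the network", and none of this is an ostree API):

```python
# Toy probe-based naming: propose a name, and if some other peer already
# claims it, retry with a numeric suffix, mDNS-style.

def claim_name(proposed, claimed):
    """Return a name not yet in `claimed`, registering it as ours."""
    candidate, n = proposed, 2
    while candidate in claimed:
        candidate = f"{proposed}-{n}"  # e.g. "repo" -> "repo-2"
        n += 1
    claimed.add(candidate)
    return candidate

claimed = set()
print(claim_name("repo", claimed))  # repo
print(claim_name("repo", claimed))  # repo-2
```

This resolves collisions, but as noted below it still needs a cryptographic binding on top, since a probed name by itself proves nothing about who serves the content.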

In both cases, there needs to be some verifiable binding between the
repository identifier and the content served by it; i.e. some crypto.

So from that perspective, perhaps then you're down the right path
with identifying remotes via GPG keys.  

It does eliminate the need to separately bind the repository's
identifier to the key. That said, that would simply be a case of
putting the identifier in the summary file, and then summary.sig would
be a witness to the binding.
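The shape of that binding check can be sketched as follows. This is hypothetical pseudocode-made-runnable: GPG is stubbed out with SHA-256 purely so the sketch is self-contained, and the identifier format is invented; a real implementation would verify a detached OpenPGP signature over the actual summary format.

```python
import hashlib

# Hypothetical sketch: the repository identifier lives in the summary
# file, and summary.sig (made with the repository's key) witnesses the
# binding between identifier and content. NOT real crypto.

def fake_sign(summary: bytes, key: bytes) -> bytes:
    """Stand-in for a detached GPG signature (assumption, not OpenPGP)."""
    return hashlib.sha256(key + summary).digest()

def verify_binding(summary: bytes, sig: bytes, key: bytes,
                   expected_id: bytes) -> bool:
    # 1. The signature must cover the summary (stubbed check).
    if fake_sign(summary, key) != sig:
        return False
    # 2. The signed summary must contain the repository identifier,
    #    binding that identifier to the key that signed it.
    return expected_id in summary

key = b"example-public-key"
repo_id = b"repo-id:6e1a"        # hypothetical identifier
summary = b"[refs...] " + repo_id
sig = fake_sign(summary, key)

print(verify_binding(summary, sig, key, repo_id))          # True
print(verify_binding(summary, sig, key, b"repo-id:other")) # False
```

The point is just that once summary.sig covers an identifier field, a peer advertising that identifier cannot serve content signed by a different key without the check failing.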

Looking at gnunet makes me wonder if torrents would be a good source of
inspiration here. It would be pretty cool if OSTree could eventually
pull from a global DHT (that's not a use case we currently have
though).

But this is going to be a challenge for me to review/co-design because I basically don't have experience with it.

Someone else might chime in on the mailing list. :-)  As long as we
have solid use cases and reasoning behind anything we come up with, I'm
happy to press on and see what happens. (Ideally quickly, because I've
got 1.5 months to get this + higher level stuff in flatpak and gnome-
software implemented!)

For example, a machine is initially installed, and has no internet
access. It needs to be regularly updated from a USB stick.

My feeling is that we can deal with this case and the others you have by: try to resolve the requested ref in each remote (sorted by priority, file:// first etc), downloading the commit object, verify signature, then take the *newest* commit from that set.

Right?

Yup. Though sorting by priority doesn't help when you're downloading
the commit object; it should be done once the commits have all been
downloaded, to address the case where the most recent commit is
available via a slow transport and a fast one.
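The strategy as amended can be sketched like this. The remote/commit structures and the `sig_ok` field are invented for illustration (libostree's real API differs, and the GPG check is stubbed): fetch and verify the commit from every remote first, then pick the newest by commit timestamp, with remote priority only as a tie-break.

```python
# Sketch of the resolution strategy: collect the verified commit for the
# requested ref from every remote, then choose newest-timestamp-wins,
# using remote priority (lower number = higher priority, file:// first)
# only to break ties. Data structures are hypothetical.

def resolve_newest(remotes, ref):
    candidates = []
    for remote in remotes:
        commit = remote["commits"].get(ref)
        if commit is None or not commit["sig_ok"]:  # stubbed GPG check
            continue
        candidates.append((commit["timestamp"], -remote["priority"], commit))
    if not candidates:
        return None
    # Newest timestamp wins; higher-priority remote breaks ties.
    candidates.sort(key=lambda c: (c[0], c[1]), reverse=True)
    return candidates[0][2]

remotes = [
    {"priority": 0,  # e.g. a file:// USB remote, checked first
     "commits": {"os/x86_64/stable":
                 {"id": "usb", "timestamp": 100, "sig_ok": True}}},
    {"priority": 1,  # e.g. a LAN peer carrying a newer commit
     "commits": {"os/x86_64/stable":
                 {"id": "lan", "timestamp": 200, "sig_ok": True}}},
]

print(resolve_newest(remotes, "os/x86_64/stable")["id"])  # lan
```

This captures the point above: the high-priority USB remote is consulted first, but the newer LAN commit still wins once everything has been downloaded and verified.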

This can be sped up in the Avahi case by including the refspec and commit timestamp in DNS-SD records, so the summary file doesn't have to be downloaded from all peers --- just the ones we choose to pull objects from. Though this would run into the oversized DNS-SD record problem I mentioned before.
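To make the size constraint concrete: RFC 6763 limits each TXT key=value string to 255 bytes and recommends keeping the whole record small. A hypothetical "ref=timestamp" encoding (not an ostree wire format) fits one ref comfortably but blows up once a repository advertises many:

```python
# Sketch of packing refspec + commit timestamp into DNS-SD TXT strings.
# Each key=value string is capped at 255 bytes per RFC 6763; the
# encoding below is invented for illustration.

def txt_strings(refs):
    """Build one 'ref=timestamp' TXT string per advertised ref."""
    out = []
    for ref, timestamp in refs.items():
        s = f"{ref}={timestamp}".encode("utf-8")
        if len(s) > 255:
            raise ValueError("TXT string over 255 bytes")
        out.append(s)
    return out

strings = txt_strings({"fedora/26/x86_64/atomic-host": 1490788800})
total = sum(len(s) + 1 for s in strings)  # +1 length prefix per string
print(total)
```

One OS ref costs a few dozen bytes, which is fine; hundreds of flatpak refs at that rate is exactly the oversized-record problem.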

Philip



