Re: Adding Avahi and USB drive support to libostree



On Fri, 2017-03-17 at 09:21 -0400, Colin Walters wrote:
On Wed, Mar 15, 2017, at 12:55 PM, Philip Withnall wrote:
Hi list,

I’ve been thinking about the best way to upstream the support for
grabbing OSTree updates from the local network and from USB drives
which Endless OS has ...

That's pretty cool!

I’ve been thinking today about whether we should instead be treating
them like mirrors of the canonical upstream remote. I *think* we could
potentially already implement this by having some external process
generate a local mirrorlist file for each remote,

Currently, the mirrorlist code *does* fetch one-by-one in order, but I'm
not sure I want to make that ABI.  I guess though we could easily have
some logic like "try file:// URIs first" even if in the future we add
parallelization or "fastestmirror" type logic.

I was thinking about this (and decided it would make my e-mail too
long, so omitted it), but you’re bang on. I was assuming that mirror
support would be improved at some point to parallelise requests, and
that would be fine for this use case as long as:
 • ‘file:’ URIs were prioritised first (faster access)
 • mirrors are prioritised if their summary file is more up to date
than others
 • mirrors are discounted if they refuse a connection (not serving to
peers)
 • mirrors are discounted if they return a 404 for the summary file
(not serving OSTree content)
 • mirrors are not discounted (but maybe deprioritised) if they return
a 404 for an object, since different LAN peers might have different
sets of objects available, and a peer which returns a 404 for one
object might also be the only peer on the LAN which has a copy of
another object*
 • mirrors which otherwise have equal priority are accessed in a random
order (load balancing)


* This is unlikely, and probably would only arise if peer C is
downloading from peers A and B who are both still in the process of
downloading objects from the internet, but have different sets of up-
to-date objects available. Still, better safe than sorry: it’s better
to try more exhaustively and get more 404s over the LAN than to fail a
pull because of erroneously discounting a peer.
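
The prioritisation rules above can be sketched roughly as follows. This
is only an illustration: the `Mirror` type, its field names and the
relative precedence of the rules (file: first, then summary freshness,
then object 404 count) are assumptions made for the example, not
anything libostree implements today.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mirror:
    """Hypothetical record of what we have learnt about one mirror."""
    url: str
    reachable: bool = True                   # False if the connection was refused
    summary_timestamp: Optional[int] = None  # None if the summary file returned 404
    object_404s: int = 0                     # 404s seen for individual objects


def order_mirrors(mirrors: List[Mirror],
                  rng: Optional[random.Random] = None) -> List[Mirror]:
    """Return the usable mirrors in the order they should be tried."""
    rng = rng or random.Random()

    # Discount mirrors which refuse connections or have no summary file.
    usable = [m for m in mirrors
              if m.reachable and m.summary_timestamp is not None]

    # Shuffle first: the sort below is stable, so mirrors with equal
    # priority end up in a random order (crude load balancing).
    rng.shuffle(usable)

    usable.sort(key=lambda m: (
        0 if m.url.startswith("file:") else 1,  # file: URIs first
        -m.summary_timestamp,                   # newer summary first
        m.object_404s,                          # deprioritise, never discard
    ))
    return usable
```

Note that a mirror which has returned 404 for an object stays in the
list; it only moves behind otherwise-equal mirrors.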

and update it just before polling for updates. However, this is a bit
ugly, prone to synchronisation problems between the update processes,

Because it involves mutating the remote on disk?

Yeah, and hence if the daemon is updating the remotes on disk, another
process which is about to do a pull can’t be entirely sure that the
remotes.d files are up to date without synchronising state with the
daemon.

One approach we could take would be to add support for e.g.
`/run/ostree/remotes.d/foo.d` where eos-updater would have the logic for
"okay, a USB device was plugged in", then:

cat > /run/ostree/remotes.d/endlessos.conf.d/usb.conf << EOF
[remote "endless"]
prependurls=file:///mnt/usb/repo;
EOF

Where "prependurls" is prepended to the main `url=` line from the
remote.
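
As a rough sketch of how such a drop-in might be combined with the
remote's main URL: `prependurls` is only the key proposed above, not an
existing libostree option, and the semicolon-separated list format is
assumed from the snippet. The function name is made up for the example.

```python
from typing import List


def effective_urls(base_url: str, prepend_urls: str = "") -> List[str]:
    """Return the URLs to try for a remote, in order.

    `prepend_urls` is the raw value of the hypothetical `prependurls`
    key: a semicolon-separated list, optionally with a trailing
    semicolon. `base_url` is the remote's ordinary `url=` value.
    """
    prepended = [u.strip() for u in prepend_urls.split(";") if u.strip()]
    return prepended + [base_url]
```

With the usb.conf above, a remote whose `url=` is
https://example.com/repo (a placeholder) would be tried as
file:///mnt/usb/repo first and the upstream URL second.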

and would mean having a daemon in ostree.git (which I know people are
against).

To be clear I'm not personally against it strictly - we already have a
somewhat messy situation with the "ostree-host" portions living in the
same repo, which means downstreams already need to subpackage, see e.g.:
https://bugzilla.redhat.com/show_bug.cgi?id=1331369

But keep in mind we have an `ostreedev/` github org - we can easily have
`ostreedev/ostreed` in the future.

OK. I feel I’m also averse to adding a daemon for keeping the remote
list up to date because there’s no real shared state to maintain which
isn’t already maintained elsewhere:
 • avahi-daemon keeps a list of the peers which advertise given DNS-SD
records
 • gvfs/the kernel keep a list of what’s mounted

There’s no motivation for a daemon as a cache, either, since querying
this information from those sources doesn’t take much time.

Instead, I’m wondering whether libostree could gain explicit support
for finding mirrors using Avahi and monitoring mounts, similarly to how
it can currently resolve metalink and mirrorlist files.
So for Avahi support, I’d define a DNS-SD record format, and some
repository configuration which would enable it.

This part is *clearly* a daemon, right?  The enablement of this feature
seems like it'd be a systemd unit, not a flag in the repo.

(Or well, I guess it depends whether we're talking about the client
 or server logic)

Sorry, I was talking about the client logic. I meant ‘repository
configuration’ to enable code to ‘look for remotes using Avahi before
you start pulling’.

But yes, the server part of things is clearly a separate daemon, with
its own configuration. I didn’t discuss it much in my original e-mail
because it’s fairly self-contained, and – for Endless – it’s a solved
problem using eos-update-server[1] and eos-updater-avahi[2], although
that would need some simple tweaking to support serving multiple OSTree
repositories, and the DNS-SD record would need to expose more data.

The format would need to scale to multiple repositories (for example, if
a machine used OSTree for the OS, and flatpak for apps). Any machines on
the local network which advertise repositories matching the local
repository’s canonical remote URI,

Maybe instead we should have a UUID field in repositories in the
repo/config? We could easily change `ostree init` to generate one.
(Although it'd be prone to being copied if people used `rsync`)

The need for checking against a canonical remote URI is to avoid
pulling all commit metadata from a remote on the local network in order
to be reasonably sure that what it calls ‘ref1’ has what the local
machine calls ‘ref1’ as an ancestor. A UUID would do this just as well,
but would be less human readable, so would be prone to not being
changed when appropriate, and people would have to look up which
repository it refers to every time. On the other hand, having it stored
in the repository means less faff with out-of-band metadata.

Couldn’t the checksum of the root commit in the repository be used as
the repository’s UUID? That would save having to generate one, and
would fit in directly with the property we want to check (ancestry).

(Does OSTree support multiple unparented commits in the same
repository? If so, that scheme wouldn’t work.)
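
To make the root-commit idea concrete, here is a small sketch. It
models history as a plain mapping from commit checksum to parent
checksum, which is a simplification for illustration and not the
libostree API; the function names are made up. It returns no identifier
at all when the refs do not share a single root, which is the case
raised in the parenthetical above.

```python
from typing import Dict, Optional


def root_commit(parents: Dict[str, Optional[str]], checksum: str) -> str:
    """Follow parent links from `checksum` until a commit with no parent.

    A commit whose parent is not recorded in `parents` (for example,
    because history was only partially pulled) is also treated as a
    root here, which is a limitation of this sketch.
    """
    while parents.get(checksum) is not None:
        checksum = parents[checksum]
    return checksum


def repository_id(parents: Dict[str, Optional[str]],
                  ref_heads: Dict[str, str]) -> Optional[str]:
    """Return the shared root checksum of all refs, or None if they differ."""
    roots = {root_commit(parents, head) for head in ref_heads.values()}
    if len(roots) != 1:
        return None  # zero refs, or several unparented histories
    return next(iter(roots))
```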

---

Thinking about this from a step back, are mirrors the best idea here? I
initially mentioned them because they’re a way in which this could be
implemented using OSTree in its current state. But they might not be
the most general approach for long-term Avahi support in OSTree.

Putting aside the Endless OS case (where we want everything to be a
mirror of our upstream repository), do we want to support a system
where a remote can be dynamically added to the local repository for
each peer on a local network, and all sorts of divergent commit
histories pulled from them — not just mirrors of the same commit
history for a given ref?

This could collapse down to the case Endless care about, through use of
GPG signing to enforce a linear commit history for each ref.

(I’ve never actually tried using OSTree in such a distributed manner,
so I might be wrong on the details here; please correct me if so.)

If so, we’d lose the nice property of the mirror approach where the
non-trivial code which works out how to prioritise the repositories
lives in libostree. Perhaps a new ostree_repo_pull_all() could be added
to implement this logic for pulling from all known (statically
configured + dynamically detected) remotes? It would have the same
arguments as ostree_repo_pull(), but without the remote_name.

Even so, in the case of using remotes rather than mirrors, we’d still
need some identifier to make sure that a flatpak repository advertised
on the LAN doesn’t get pulled into the local machine’s OS repository.
We could use human-defined identifiers which are a property of the DNS-
SD configuration, rather than the repositories? Then it’s possible for
the sysadmins to define the domain through which each repository is
shared by changing the DNS-SD configuration for the server and client
sides.

For example, advertise the following services using DNS-SD:
 • eos.ostree._tcp
 • flatpak.ostree._tcp
 • my_random_use_of_ostree_repos.ostree._tcp

Each DNS-SD service would correspond to a single repository being
served by that peer, and would expose DNS-SD records giving its refs
(and the commits they point to) and the port to download from.
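
A minimal sketch of the client-side filtering this would allow, assuming
the service naming scheme from the examples above. The `Advert` type and
its fields are hypothetical stand-ins for whatever an Avahi browse would
actually return; no real Avahi API is used here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Suffix taken from the example service names above (an assumption).
SERVICE_SUFFIX = ".ostree._tcp"


@dataclass
class Advert:
    """Hypothetical summary of one DNS-SD service seen on the LAN."""
    service: str                 # e.g. "eos.ostree._tcp"
    host: str
    port: int                    # port to download from
    refs: Dict[str, str] = field(default_factory=dict)  # ref -> commit


def peers_for(adverts: List[Advert], repo_id: str) -> List[Advert]:
    """Keep only adverts whose service matches the local repository's
    human-defined identifier, so that (for example) a flatpak
    repository is never considered for the OS repository."""
    wanted = repo_id + SERVICE_SUFFIX
    return [a for a in adverts if a.service == wanted]
```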

would be added as HTTP mirrors. This would require some server support
(at the minimum, an Avahi .service file and a running OSTree simple HTTP
server).

This is the part that is a daemon.

Yup, sorry for not being clearer about that. Again, see [1] and [2] for
what we have already for this.

For USB drive support, I’d define some repository configuration which
enables it and specifies which well-known directory to look for in each
mount (for example, check if $mount/.updates-in-here is an OSTree
repository for each $mount point when we do a pull; ‘.updates-in-here’
would be specified in the repository configuration), and adds each of
them as a mirror.
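
The mount scan could look something like the sketch below. The list of
mount points is passed in rather than queried from gvfs or the kernel,
and the "is this an OSTree repository" test is only a heuristic (a
`config` file plus an `objects` directory); both are assumptions made
for the example, as are the function names.

```python
from pathlib import Path
from typing import Iterable, List


def looks_like_ostree_repo(path: Path) -> bool:
    """Heuristic only: a repository directory normally contains a
    `config` file and an `objects` directory."""
    return (path / "config").is_file() and (path / "objects").is_dir()


def usb_mirror_urls(mount_points: Iterable[str],
                    well_known_dir: str = ".updates-in-here") -> List[str]:
    """Return a file: URI for every mount which carries a repository
    in the well-known directory, in the order the mounts were given."""
    urls = []
    for mount in mount_points:
        candidate = Path(mount) / well_known_dir
        if looks_like_ostree_repo(candidate):
            urls.append(candidate.resolve().as_uri())
    return urls
```

The resulting file: URIs could then be handed to whatever mechanism ends
up adding mirrors to the remote.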

Circling back to the top then, my instinct in the short term is to change
libostree to make it less awkward for daemons to inject configuration
into the remote, and add some baked in logic like "try file:// URIs
first".

I’d like your thoughts on a non-daemon-based solution, since I’m pretty
convinced that a daemon is overkill for the client side here.

Sorry, this got a bit rambly. :-(

Philip

[1]: https://github.com/endlessm/eos-updater/blob/master/src/eos-update-server.c
[2]: https://github.com/endlessm/eos-updater/tree/master/eos-updater-avahi
