On Fri, 2017-03-17 at 09:21 -0400, Colin Walters wrote:
> On Wed, Mar 15, 2017, at 12:55 PM, Philip Withnall wrote:
> > Hi list, I’ve been thinking about the best way to upstream the support for grabbing OSTree updates from the local network and from USB drives which Endless OS has ...
> That's pretty cool!
> > I’ve been thinking today about whether we should instead be treating them like mirrors of the canonical upstream remote. I *think* we could potentially already implement this by having some external process generate a local mirrorlist file for each remote,
> Currently, the mirrorlist code *does* fetch one-by-one in order, but I'm not sure I want to make that ABI. I guess though we could easily have some logic like "try file:// URIs first" even if in the future we add parallelization or "fastestmirror" type logic.
I was thinking about this (and decided it would make my e-mail too long, so omitted it), but you’re bang on. I was assuming that mirror support would be improved at some point to parallelise requests, and that would be fine for this use case as long as:
 • ‘file:’ URIs were prioritised first (faster access)
 • mirrors are prioritised if their summary file is more up to date than others
 • mirrors are discounted if they refuse a connection (not serving to peers)
 • mirrors are discounted if they return a 404 for the summary file (not serving OSTree content)
 • mirrors are not discounted (but maybe deprioritised) if they return a 404 for an object, since different LAN peers might have different sets of objects available, and a peer which returns a 404 for one object might also be the only peer on the LAN which has a copy of another object*
 • mirrors which otherwise have equal priority are accessed in a random order (load balancing)

* This is unlikely, and probably would only arise if peer C is downloading from peers A and B who are both still in the process of downloading objects from the internet, but have different sets of up-to-date objects available. Still, better safe than sorry: it’s better to try more exhaustively and get more 404s over the LAN than to fail a pull because of erroneously discounting a peer.
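To make the prioritisation rules above a bit more concrete, here is a minimal sketch in Python. It is not libostree code: the `Mirror` record, its field names, the idea of a comparable summary timestamp, and the relative weighting of the rules (file: first, then summary freshness, then object 404 count) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mirror:
    """Hypothetical record of what a client has learned about one mirror."""
    url: str
    reachable: bool = True                   # False if the connection was refused
    summary_timestamp: Optional[int] = None  # None if the summary file returned 404
    object_404s: int = 0                     # objects this mirror did not have so far


def order_mirrors(mirrors: List[Mirror],
                  rng: Optional[random.Random] = None) -> List[Mirror]:
    """Return the usable mirrors, best candidate first."""
    rng = rng or random.Random()
    # Discount mirrors which refuse connections or serve no summary file.
    usable = [m for m in mirrors
              if m.reachable and m.summary_timestamp is not None]
    # Shuffle first: sorted() is stable, so mirrors which compare equal
    # below stay in random order (crude load balancing).
    rng.shuffle(usable)
    return sorted(
        usable,
        key=lambda m: (
            not m.url.startswith("file:"),  # file: URIs first
            -m.summary_timestamp,           # newer summary first
            m.object_404s,                  # deprioritised, never dropped
        ),
    )
```

A mirror which has returned 404s for some objects stays in the list, so a later object can still be tried against it.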
> > and update it just before polling for updates. However, this is a bit ugly, prone to synchronisation problems between the update processes,
> Because it involves mutating the remote on disk?
Yeah, and hence if the daemon is updating the remotes on disk, another process which is about to do a pull can’t be entirely sure that the remotes.d files are up to date without synchronising state with the daemon.
> One approach we could take would be to add support for e.g. `/run/ostree/remotes.d/foo.d` where eos-updater would have the logic for "okay, a USB device was plugged in", then:
>
> cat > /run/ostree/remotes.d/endlessos.conf.d/usb.conf << EOF
> [remote "endless"]
> prependurls=file:///mnt/usb/repo;
> EOF
>
> Where "prependurls" is prepended to the main `url=` line from the remote.
> > and would mean having a daemon in ostree.git (which I know people are against).
> To be clear I'm not personally against it strictly - we already have a somewhat messy situation with the "ostree-host" portions living in the same repo, which means downstreams already need to subpackage, see e.g.: https://bugzilla.redhat.com/show_bug.cgi?id=1331369
> But keep in mind we have an `ostreedev/` github org - we can easily have `ostreedev/ostreed` in the future.
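For illustration, the "prependurls" drop-in idea quoted above might resolve to an effective URL list roughly as follows. This is only a sketch in Python: "prependurls" is a proposed key, not an existing OSTree option. The `effective_urls` helper, the ';'-separated list handling and the example upstream URL are all assumptions.

```python
import configparser
from typing import List


def split_list(value: str) -> List[str]:
    """Split a ';'-separated list, ignoring empty entries."""
    return [item.strip() for item in value.split(";") if item.strip()]


def effective_urls(main_conf: str, dropin_confs: List[str],
                   section: str) -> List[str]:
    """Combine the remote's main url= with prependurls= from drop-ins.

    `prependurls` is the hypothetical key from the proposal; URLs from
    drop-ins are placed before the main URL so they are tried first.
    """
    main = configparser.ConfigParser(interpolation=None)
    main.read_string(main_conf)
    urls = [main[section]["url"]]
    for text in dropin_confs:
        dropin = configparser.ConfigParser(interpolation=None)
        dropin.read_string(text)
        if dropin.has_section(section):
            extra = dropin[section].get("prependurls", fallback="")
            urls = split_list(extra) + urls
    return urls


# Example: one upstream URL (placeholder) plus the USB drop-in.
MAIN = '[remote "endless"]\nurl=https://example.com/repo\n'
USB = '[remote "endless"]\nprependurls=file:///mnt/usb/repo;\n'
print(effective_urls(MAIN, [USB], 'remote "endless"'))
```

Because the drop-ins would live under /run, they disappear on reboot and the remote's main configuration on disk is never modified.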
OK. I feel I’m also averse to adding a daemon for keeping the remote list up to date because there’s no real shared state to maintain which isn’t already maintained elsewhere:
 • avahi-daemon keeps a list of the peers which advertise given DNS-SD records
 • gvfs/the kernel keep a list of what’s mounted

There’s no motivation for a daemon as a cache, either, since querying this information from those sources doesn’t take much time.
> > Instead, I’m wondering whether libostree could gain explicit support for finding mirrors using Avahi and monitoring mounts, similarly to how it can currently resolve metalink and mirrorlist files. So for Avahi support, I’d define a DNS-SD record format, and some repository configuration which would enable it.
> This part is *clearly* a daemon, right? The enablement of this feature seems like it'd be a systemd unit, not a flag in the repo. (Or well, I guess it depends whether we're talking about the client or server logic)
Sorry, I was talking about the client logic. I meant ‘repository configuration’ to enable code to ‘look for remotes using Avahi before you start pulling’. But yes, the server part of things is clearly a separate daemon, with its own configuration. I didn’t discuss it much in my original e-mail because it’s fairly self-contained, and – for Endless – it’s a solved problem using eos-update-server[1] and eos-updater-avahi[2]. Although that would need some simple tweaking to support serving multiple OSTree repositories, and the DNS-SD record would need to expose more data.
> > The format would need to scale to multiple repositories (for example, if a machine used OSTree for the OS, and flatpak for apps). Any machines on the local network which advertise repositories matching the local repository’s canonical remote URI,
> Maybe instead we should have a UUID field in repositories in the repo/config? We could easily change `ostree init` to generate one. (Although it'd be prone to being copied if people used `rsync`)
The need for checking against a canonical remote URI is to avoid pulling all commit metadata from a remote on the local network in order to be reasonably sure that what it calls ‘ref1’ has what the local machine calls ‘ref1’ as an ancestor. A UUID would do this just as well, but would be less human readable, so would be prone to not being changed when appropriate, and people would have to look up which repository it refers to every time. On the other hand, having it stored in the repository means less faff with out-of-band metadata.

Couldn’t the checksum of the root commit in the repository be used as the repository’s UUID? That would save having to generate one, and would fit in directly with the property we want to check (ancestry). (Does OSTree support multiple unparented commits in the same repository? If so, that scheme wouldn’t work.)

---

Thinking about this from a step back, are mirrors the best idea here? I initially mentioned them because they’re a way which this could be implemented using OSTree in its current state. But they might not be the most general approach for long-term Avahi support in OSTree.

Putting aside the Endless OS case (where we want everything to be a mirror of our upstream repository), do we want to support a system where a remote can be dynamically added to the local repository for each peer on a local network, and all sorts of divergent commit histories pulled from them – not just mirrors of the same commit history for a given ref? This could collapse down to the case Endless care about, through use of GPG signing to enforce a linear commit history for each ref. (I’ve never actually tried using OSTree in such a distributed manner, so I might be wrong on the details here; please correct me if so.)

If so, we’d lose the nice property of the mirror approach where the non-trivial code which works out how to prioritise the repositories lives in libostree.
Perhaps a new ostree_repo_pull_all() could be added to implement this logic for pulling from all known (statically configured + dynamically detected) remotes? It would have the same arguments as ostree_repo_pull(), but without the remote_name.

Even so, in the case of using remotes rather than mirrors, we’d still need some identifier to make sure that a flatpak repository advertised on the LAN doesn’t get pulled in to the local machine’s OS repository.

We could use human-defined identifiers which are a property of the DNS-SD configuration, rather than the repositories? Then it’s possible for the sysadmins to define the domain through which each repository is shared by changing the DNS-SD configuration for the server and client sides. For example, advertise the following services using DNS-SD:
 • eos.ostree._tcp
 • flatpak.ostree._tcp
 • my_random_use_of_ostree_repos.ostree._tcp

Each DNS-SD service would correspond to a single repository being served by that peer, and would expose DNS-SD records giving its refs (and the commits they point to) and the port to download from.
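As a rough sketch of how a client might use those per-repository service names, the Python below filters already-discovered adverts down to the peers relevant to one local repository. Discovery itself (via Avahi) is assumed to have happened elsewhere. The `Advert` record, its fields and `candidate_peers()` are hypothetical names, and the hosts and checksums in the example are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Advert:
    """Hypothetical view of one DNS-SD advert after resolution."""
    service_type: str  # e.g. "eos.ostree._tcp"
    host: str
    port: int
    refs: Dict[str, str] = field(default_factory=dict)  # ref -> commit checksum


def candidate_peers(adverts: List[Advert], service_type: str,
                    ref: str) -> List[str]:
    """Return HTTP base URLs of peers serving `ref` under `service_type`.

    Matching on the service type keeps, say, a flatpak repository on the
    LAN from being considered for the local machine's OS repository.
    """
    return [
        f"http://{a.host}:{a.port}"
        for a in adverts
        if a.service_type == service_type and ref in a.refs
    ]


# Example with placeholder hosts and checksums.
adverts = [
    Advert("eos.ostree._tcp", "peer-a.local", 8080, {"os/eos/amd64": "abc"}),
    Advert("flatpak.ostree._tcp", "peer-b.local", 8081, {"app/foo": "def"}),
]
print(candidate_peers(adverts, "eos.ostree._tcp", "os/eos/amd64"))
```

Whether the resulting URLs are then treated as mirrors of one remote or as separate remotes is exactly the open question above; the filtering step would be the same either way.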
> > would be added as HTTP mirrors. This would require some server support (at the minimum, an Avahi .service file and a running OSTree simple HTTP server).
> This is the part that is a daemon.
Yup, sorry for not being clearer about that. Again, see [1] and [2] for what we have already for this.
> > For USB drive support, I’d define some repository configuration which enables it and specifies which well-known directory to look for in each mount (for example, check if $mount/.updates-in-here is an OSTree repository for each $mount point when we do a pull; ‘.updates-in-here’ would be specified in the repository configuration), and adds each of them as a mirror.
> Circling back to the top then, my instinct in the short term is to change libostree to make it less awkward for daemons to inject configuration into the remote, and add some baked in logic like "try file:// URIs first".
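The USB drive check described in the quoted text could be sketched along these lines in Python. This is an illustration only: `find_usb_repos()` is a hypothetical helper, the list of mount points is assumed to come from elsewhere (gvfs or the kernel), and the test for "is an OSTree repository" is a rough heuristic (a `config` file plus an `objects` directory), not what libostree would actually do.

```python
import os
from typing import Iterable, List


def looks_like_ostree_repo(path: str) -> bool:
    """Rough heuristic (an assumption): `config` file + `objects` dir."""
    return (os.path.isfile(os.path.join(path, "config"))
            and os.path.isdir(os.path.join(path, "objects")))


def find_usb_repos(mount_points: Iterable[str],
                   well_known_dir: str = ".updates-in-here") -> List[str]:
    """Return file:// URIs for each mount containing a repository.

    `well_known_dir` stands in for the directory name which would be
    specified in the repository configuration.
    """
    found = []
    for mount in mount_points:
        candidate = os.path.join(mount, well_known_dir)
        if looks_like_ostree_repo(candidate):
            found.append("file://" + os.path.abspath(candidate))
    return found
```

The returned file:// URIs could then be fed into whichever mechanism ends up being used (mirrors or injected remote configuration), where they would naturally sort first under a "try file:// URIs first" rule.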
I’d like your thoughts on a non-daemon-based solution, since I’m pretty convinced that a daemon is overkill for the client side here.

Sorry, this got a bit rambly. :-(

Philip

[1]: https://github.com/endlessm/eos-updater/blob/master/src/eos-update-server.c
[2]: https://github.com/endlessm/eos-updater/tree/master/eos-updater-avahi