On Wed, 2017-03-29 at 10:04 -0400, Colin Walters wrote:
> On Wed, Mar 29, 2017, at 05:16 AM, Philip Withnall wrote:
> > In the sense that each ref has a different GPG key; or in the sense that the ref name is a key ID? In either case, commits would need to support being signed by multiple keys (which I assume they do already by appending the signature packets in the .sig file?).
>
> While nothing *requires* it, the naming convention established from the start of ostree for refs encourages them to be globally unique - this is *very* different from git, where almost everyone has a `master` branch. And the ref name is what's used for downloads/upgrades (by default). I chose this because:
>
>  - The concept of "master" doesn't make as much sense since it's intended to have multiple equally-important branches (e.g. fedora/26/x86_64/atomic-host and fedora/26/x86_64/workstation)
>  - And besides the "content" differentiation, there's CPU architecture in the branch name
That makes sense. Using the ref name as an identifier would be a possibility, but I don't think it would work with Avahi, since a peer would have to advertise every ref it has, which would result in a huge amount of mDNS traffic for the DNS-SD records for a flatpak repository, for example. :-( That's N things to advertise per repository, instead of 1.
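(To illustrate the scale difference: with DNS-SD, advertising one repository is roughly a PTR + SRV + TXT record set for a single service instance, along the lines below; advertising per-ref would mean one such set for every branch in the repository. The service type, instance name and contents here are invented purely for illustration.)

  _ostree_repo._tcp.local.              PTR  eos-desktop._ostree_repo._tcp.local.
  eos-desktop._ostree_repo._tcp.local.  SRV  0 0 43218 eos-desktop.local.
  eos-desktop._ostree_repo._tcp.local.  TXT  "v=1"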
> > Are you suggesting this because of the way flatpak uses OSTree --- one repository which deduplicates files from various separate applications, each of which has its own ref, and theoretically is its own security domain?
>
> Right. And the same for the host OS case - it should never happen that a flatpak ref is the same as any existing host ref, and all of those should be distinct from refs generated by e.g. the `atomic` command importing Docker/OCI images, etc.
+1
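(To make the namespacing concrete, refs from the different sources look visibly different even when they live in a single repository; the flatpak refs below follow its usual app/runtime naming, and the host ref is just Colin's earlier example:)

  fedora/26/x86_64/atomic-host            (host OS)
  app/org.gnome.Builder/x86_64/stable     (flatpak app)
  runtime/org.gnome.Platform/x86_64/3.24  (flatpak runtime)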
> > That's a valuable use case, but not the one I am trying to address here. The use case I am trying to address is for updating *without* a connection to the internet.
>
> OK. I read back and didn't see this mentioned from the start.
Argh, my bad, sorry. I shouldn't have jumped straight in with an implementation suggestion.
> Anyways, I'll be honest, I have very little practical experience with mesh/P2P/decentralized networking.
I don't think there's too much different here: we're still using HTTP, and the GPG security model leans towards decentralisation anyway. The main difference is the need to advertise and find remotes.
The "completely disconnected from the Internet" model also occurs in a primary use case for me, which is servers. But there, what we've documented is changing the e.g. /etc/ostree/remotes.d/redhat.conf file. In the enterprise server case of course, there are sysadmins who are dedicated to managing configuration like this, *and* it's very typical to model things as an "intranet" - you still use DNS, TLS (but with custom CA certs) etc. So pointing the remote at mirror.examplecorp.com works just fine and is expected.
That makes sense, but is unfortunately not applicable in the Endless OS use case, where we want things to need zero configuration, and cannot require custom CA certs, etc.
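(For reference for anyone following the thread, the remote stanza Colin is describing is just a keyfile under /etc/ostree/remotes.d/; pointing it at an intranet mirror would look something like the following, with the URL path invented:)

  [remote "redhat"]
  url=https://mirror.examplecorp.com/ostree/repo
  gpg-verify=true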
> So let me ask this - in the mesh (is there a better/standard word?) case, what are you doing for *other* applications? What does e.g. the web browser app do? How do you deliver and cache non-OS content?
Generally, these kinds of systems which are one hop removed from the internet have a load of pre-loaded content (for example, an offline copy of Wikipedia), and can be served other content from the machine which *is* connected to the internet via a local HTTP server (although to be honest, I don't know the details of the distribution of non-OS content). These users are not expecting to be able to load up Facebook!
> What prior art is there in other mesh software for problem sets like naming/identification? I glanced at https://gnunet.org/about and indeed they seem to have a DNS-like system that's derived from a public/private key.
I'm not sure we need the naming of things to be distributed, do we? These use cases care about decentralised distribution of content which comes from some centralised authority (an OS vendor with a signing key), so the naming can be centralised.

That said, it would be nice to make the naming decentralised as well. I think the two standard approaches to decentralised naming are: hierarchical naming, where each peer is in control of some namespace (as you say); and probe-based naming, where each peer proposes a name and sees if anyone else has claimed it (as per mDNS A/AAAA records). In both cases, there needs to be some verifiable binding between the repository identifier and the content served by it; i.e. some crypto.
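To make "some crypto" slightly more concrete, one option is to reuse the machinery ostree already has: put the identifier in the summary file's metadata and let summary.sig vouch for it. A very rough sketch (the "xa.repository-id" metadata key is invented purely for illustration; the calls are existing libostree/GLib API, assuming I've remembered the signatures correctly):

  /* Sketch only: "xa.repository-id" is a hypothetical metadata key. */
  #include <gio/gio.h>
  #include <ostree.h>

  static gboolean
  verify_repository_id (OstreeRepo  *repo,
                        const char  *remote_name,
                        const char  *expected_id,
                        GError     **error)
  {
    g_autoptr(GBytes) summary_bytes = NULL;
    g_autoptr(GBytes) signature_bytes = NULL;
    g_autoptr(OstreeGpgVerifyResult) gpg_result = NULL;
    g_autoptr(GVariant) summary = NULL;
    g_autoptr(GVariant) metadata = NULL;
    g_autoptr(GVariant) id = NULL;

    /* Download summary and summary.sig from the remote. */
    if (!ostree_repo_remote_fetch_summary (repo, remote_name,
                                           &summary_bytes, &signature_bytes,
                                           NULL, error))
      return FALSE;

    /* Check the GPG signature over the summary file using the keys
     * already configured for this remote. */
    gpg_result = ostree_repo_verify_summary (repo, remote_name,
                                             summary_bytes, signature_bytes,
                                             NULL, error);
    if (gpg_result == NULL ||
        !ostree_gpg_verify_result_require_valid_signature (gpg_result, error))
      return FALSE;

    /* The summary is (a(s(taya{sv}))a{sv}); the trailing a{sv} is free-form
     * metadata, which is where a repository identifier could live. */
    summary = g_variant_ref_sink (g_variant_new_from_bytes (OSTREE_SUMMARY_GVARIANT_FORMAT,
                                                            summary_bytes, FALSE));
    metadata = g_variant_get_child_value (summary, 1);
    id = g_variant_lookup_value (metadata, "xa.repository-id",
                                 G_VARIANT_TYPE_STRING);

    return id != NULL &&
           g_str_equal (g_variant_get_string (id, NULL), expected_id);
  }

That would tie whatever identifier we pick to the key(s) the client already trusts for that remote, which is all the binding we need for the centralised-authority case.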
> So from that perspective, perhaps then you're down the right path with identifying remotes via GPG keys.
It does eliminate the need to separately bind the repository's identifier to the key. That said, that would simply be a case of putting the identifier in the summary file, and then summary.sig would be a witness to the binding (as in the sketch above).

> Looking at gnunet makes me wonder if torrents would be a good source of inspiration here. It would be pretty cool if OSTree could eventually pull from a global DHT (that's not a use case we currently have though).
> But this is going to be a challenge for me to review/co-design because I basically don't have experience with it.
Someone else might chime in on the mailing list. :-) As long as we have solid use cases and reasoning behind anything we come up with, I'm happy to press on and see what happens. (Ideally quickly, because I've got 1.5 months to get this + higher level stuff in flatpak and gnome-software implemented!)
> > For example, a machine is initially installed, and has no internet access. It needs to be regularly updated from a USB stick.
>
> My feeling is that we can deal with this case and the others you have by: try to resolve the requested ref in each remote (sorted by priority, file:// first etc), downloading the commit object, verify signature, then take the *newest* commit from that set. Right?
Yup. Though sorting by priority doesn't help while you're downloading the commit objects; it should be applied once the commits have all been downloaded, to handle the case where the most recent commit is available via both a slow transport and a fast one. This can be sped up in the Avahi case by including the refspec and commit timestamp in DNS-SD records, so the summary file doesn't have to be downloaded from all peers --- just the ones we choose to pull objects from. Though this would run into the oversized DNS-SD record problem I mentioned before.
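For clarity, the selection step I have in mind is roughly the following (a throwaway sketch, not a proposed API; the struct and names are made up):

  /* Sketch of the policy: fetch and verify the head commit of the requested
   * ref from every candidate remote, then pull from the candidate with the
   * newest signed commit, breaking ties by remote priority so a fast/local
   * transport wins when the commits are equally new. */
  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  typedef struct {
    const char *remote_name;      /* e.g. "usb-stick", "lan-peer", "internet" */
    int         priority;         /* lower value = preferred transport */
    uint64_t    commit_timestamp; /* taken from the downloaded commit object */
    bool        signature_ok;     /* result of GPG verification */
  } CandidateCommit;

  /* Returns the index of the candidate to pull from, or -1 if none verified. */
  static int
  choose_candidate (const CandidateCommit *c, size_t n)
  {
    int best = -1;

    for (size_t i = 0; i < n; i++)
      {
        if (!c[i].signature_ok)
          continue;
        if (best < 0 ||
            c[i].commit_timestamp > c[best].commit_timestamp ||
            (c[i].commit_timestamp == c[best].commit_timestamp &&
             c[i].priority < c[best].priority))
          best = (int) i;
      }

    return best;
  }

Philip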