Hey,

On Tue, 2017-05-30 at 21:56 +0200, Krzesimir Nowak wrote:
> On Sun, May 28, 2017 at 12:53 AM, Philip Withnall <philip tecnocode co uk> wrote:
>> Hi all,
>>
>> Here’s a bit of a writeup about something which I’ve been discussing with Colin and Alex recently. It’s primarily of interest to them, but it affects the core of OSTree and how flatpak uses OSTree, so feedback from anyone else (especially other users of OSTree) is very welcome.
>>
>> Apologies for the length. There are cookies at the end.
>
> I probably will only have some bikeshedding comments here.
🚲🖌
>> Various people have suggested a different approach which disambiguates ref names based on a second token (an ‘origin ID’, which has previously been called an ‘originish’, but that’s not very obvious terminology), so the combination of (origin ID, ref name) is globally unique.
>
> My complaint would be that ostree already uses the "origin" name for the files related to deployments. Overloading the name with yet another meaning may be confusing - one may think that this file and the origin ID are somehow related. My ideas for a name were "source" or "init", but maybe googling for origin synonyms could result in some better alternatives.
Fair. I’ve replied to Colin about this. My suggestion is ‘collection’.
>> The origin ID can be added to the summary file as a new metadata key, leaving the existing ref map to be indexed by ref name as before. Semantically, all the refs in the ref map can be assumed to have that same origin ID. If the summary file contains refs from more than one origin, one of the origins is arbitrarily picked as the main one, to be treated as above; and the refs from the other origins are listed in a second map, which maps origin ID to a ref map of the refs from that origin (each with the same semantics as the main ref map).
>>
>> Picking one of the origins as the main origin for the summary file, rather than leaving the main ref map empty and using only the second map, means that new versions of Endless OS can propagate OS updates to older versions via P2P redistribution without needing a separate backwards compatibility path (we already do P2P OS updates without the use of an origin ID).
>
> I am not sure I understand this. Who picks the origin as the main one? The server? The client? Is main origin ID always the one with refs in the main ref map and the "propagated" origin IDs are always in the second map?
If you are publishing a repository on the internet, you pick an origin ID and include it in your summary file. If you are redistributing refs on the LAN, whatever software is doing the redistribution (in our case, eos-update-server) chooses the origin-id.

The main origin-id always relates to the refs in the main ref map. Other origin IDs are included in the origin-map. That is, the set of refs and origins we care about is (origin-id + refs-map) + origin-map.

Perhaps an example would help. In this example, we’ve got two origin servers on the internet, and my computer. My computer is redistributing refs via P2P, so it also appears as a server.

Origin #1:
 - origin-id: eos-apps
 - Refs:
    - refs/heads/app1
    - refs/heads/app2
 - origin-map is unset in the summary file
 - refs map in the summary file lists [app1, app2]
 - refs/remotes and refs/mirrors are empty

Origin #2:
 - origin-id: eos-production
 - Refs:
    - refs/heads/eos/amd64/master
 - origin-map is unset in the summary file
 - refs map in the summary file lists [eos/amd64/master]
 - refs/remotes and refs/mirrors are empty

My computer:
 - origin-id is unset in the summary file (there is no summary file in my local repository)
 - Remotes in local config:
    - eos-apps
    - eos-production
 - Refs:
    - refs/remotes/eos-apps/app1
    - refs/remotes/eos-apps/app2
    - refs/remotes/eos-production/eos/amd64/master
 - origin-map is unset (no summary file)
 - refs map is unset (no summary file)
 - refs/heads and refs/mirrors are empty

My computer, as seen by another machine on the LAN:
 - origin-id is arbitrarily set to eos-production (there *is* a generated summary file)
 - No remotes exposed in the config file
 - Refs:
    - refs/heads/eos/amd64/master (alias of my local refs/remotes/eos-production/eos/amd64/master)
    - refs/mirrors/eos-apps/app1 (alias of my local refs/remotes/eos-apps/app1)
    - refs/mirrors/eos-apps/app2 (alias of my local refs/remotes/eos-apps/app2)
 - refs/remotes is empty
 - origin-map lists all the origins and their refs except eos-production:
    - eos-apps: [app1, app2]
 - refs map lists all the eos-production refs: [eos/amd64/master]

The choice of setting the origin-id to eos-production when redistributing from my machine is an OS-specific one: in the EOS case, we were already distributing OS updates over P2P, so we’d need to set the origin-id to that of the OS repository so that the OS refs appear in refs/heads for other, older machines on the LAN to read.

If there were no backwards compatibility concerns, my computer could appear as follows on the LAN, which is a bit simpler:
 - origin-id is unset (there *is* a generated summary file)
 - No remotes exposed in the config file
 - Refs:
    - refs/mirrors/eos-apps/app1 (alias of my local refs/remotes/eos-apps/app1)
    - refs/mirrors/eos-apps/app2 (alias of my local refs/remotes/eos-apps/app2)
    - refs/mirrors/eos-production/eos/amd64/master (alias of my local refs/remotes/eos-production/eos/amd64/master)
 - refs/heads and refs/remotes are empty
 - origin-map lists all the origins and their refs:
    - eos-apps: [app1, app2]
    - eos-production: [eos/amd64/master]
 - refs map is empty

I hope that makes sense.

The other use for having origin-id and origin-map set in the same summary file is if a server were the origin for some refs and was also redistributing refs from other origins. I guess maybe that could be used for caching and distributed fault tolerance.
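To make the two LAN-facing layouts above concrete, here is a minimal Python sketch of how the redistributing machine might split its refs between the main ref map and the origin-map. This is only an illustration under stated assumptions: the `Summary` class and `build_lan_summary()` are hypothetical names, the real summary file is a GVariant rather than a Python object, and remote names are assumed to double as origin IDs.

```python
# Hypothetical model of a summary file. The real OSTree summary is a GVariant;
# the field names here only mirror the terms used in this thread.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Summary:
    origin_id: Optional[str] = None                  # main origin, if any
    refs: List[str] = field(default_factory=list)    # main ref map (names only)
    origin_map: Dict[str, List[str]] = field(default_factory=dict)


def build_lan_summary(remote_refs: Dict[str, List[str]],
                      main_origin: Optional[str] = None) -> Summary:
    """Build the summary a redistributing machine would expose on the LAN.

    remote_refs maps each local remote name (assumed to equal the origin ID)
    to the refs held under refs/remotes/<remote>/. If main_origin is given,
    its refs go in the main ref map; every other origin goes in the origin-map.
    """
    if main_origin is not None and main_origin not in remote_refs:
        raise ValueError(f"unknown origin: {main_origin}")
    summary = Summary(origin_id=main_origin)
    for origin, refs in sorted(remote_refs.items()):
        if origin == main_origin:
            summary.refs = sorted(refs)
        else:
            summary.origin_map[origin] = sorted(refs)
    return summary


local = {
    "eos-apps": ["app1", "app2"],
    "eos-production": ["eos/amd64/master"],
}
# Backwards-compatible layout: OS refs stay in the main ref map.
compat = build_lan_summary(local, main_origin="eos-production")
# Simpler layout: no main origin, everything in the origin-map.
simple = build_lan_summary(local)
```

With `main_origin` set, older clients that only read the main ref map would still find the OS refs; without it, every ref is reachable only through the origin-map.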
>> Origin naming scheme
>> ---
>>
>> So that origin IDs can match remote names, they must share the same naming scheme (currently, for example, ‘gnome-apps’). We might want to transition to a different naming scheme (reverse-DNS, for example, ‘org.flathub’) in future if it would make uniqueness easier.
>>
>> In any case, origin IDs have to be globally unique. If this is hard to achieve with free-form IDs, we might instead want to use GUIDs, and match them to local remote configuration by including the origin GUID in the remote configuration as an additional key. This would require a migration step for existing configurations, whereas matching by origin ID = remote name potentially doesn’t, if we assume that most people give their remote configuration a predictable name.
>
> I'd prefer to stick to one scheme and enforce it, to avoid having a situation where we have a mix of conventions for naming the origin ID. Also, anything but GUID. Or maybe we shouldn't care if the name is not going to be used/seen/typed by the user.
The branch I’ve got at the moment uses origin IDs which are like remote names (and matches the two based on that), and it seems to fit into the code fairly well. In the absence of arguments against that, or disasters when I try to integrate this approach into flatpak, I’ll go with that.
>> New API
>> ===
>>
>> For the moment, this will only require new API for resolving and pulling refs over P2P (a very similar API to what is already in my current attempt at https://github.com/pwithnall/ostree/tree/lan-and-usb ). None of the API which deals with local refspecs needs to change, as their semantics remain unchanged.
>>
>> A new version of ostree_repo_remote_list_refs() might need to be created which returns the origin IDs as well as the refs. We’d have to ensure the existing version only returned the refs which that remote is an origin for: the ones listed in the summary file’s main ref map. That should already be the case, so there is no backwards compatibility concern.
>
> A bit offtopic here - it is ostree_repo_list_refs. And a minor pet peeve of mine - shouldn't this be named ostree_repo_list_refspecs? It has bitten me more than once, where I thought I got a ref, but really it was a refspec.
Indeed. I’ll try to ensure the new functions are named consistently. I’m going with OstreeOriginRef (as the (origin ID, ref name) tuple) and ostree_blah_origin_ref_blah() for method names which would previously have been ostree_blah_ref_blah().

Philip
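As a rough illustration of the (origin ID, ref name) tuple and of the difference between the existing and the new listing behaviour discussed above, here is a small Python sketch. Everything in it is an assumption for illustration: `OriginRef`, `list_refs_legacy()` and `list_origin_refs()` are hypothetical names rather than OSTree API, and the summary is modelled as a plain dict instead of the real GVariant.

```python
from typing import List, NamedTuple, Optional


class OriginRef(NamedTuple):
    """Illustrative stand-in for the proposed (origin ID, ref name) tuple."""
    origin_id: Optional[str]
    ref_name: str


def list_refs_legacy(summary: dict) -> List[str]:
    """What an existing caller would see: only the main ref map."""
    return list(summary.get("refs", []))


def list_origin_refs(summary: dict) -> List[OriginRef]:
    """Every (origin ID, ref name) pair: main ref map plus origin-map."""
    result = [OriginRef(summary.get("origin-id"), ref)
              for ref in summary.get("refs", [])]
    for origin, refs in summary.get("origin-map", {}).items():
        result.extend(OriginRef(origin, ref) for ref in refs)
    return result


# The backwards-compatible LAN summary from the example earlier in the thread.
summary = {
    "origin-id": "eos-production",
    "refs": ["eos/amd64/master"],
    "origin-map": {"eos-apps": ["app1", "app2"]},
}
legacy = list_refs_legacy(summary)
pairs = list_origin_refs(summary)
```

The legacy listing ignores the origin-map entirely, which is what keeps existing callers unaffected; only the new listing exposes refs redistributed from other origins.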