> To fix this we took a leaf from modern programming language package
> managers. We use the lockfile concept as used by rust's cargo package
> manager (cargo.lock[1]) or nodejs's npm (package-lock.json[2]).
This makes a lot of sense.
Your CI job is then a lot like e.g. Dependabot:
https://dependabot.com/
Although I think the first time I personally saw the "CI bot updating pinned data"
pattern was in the context of the Cockpit project, which does it for fixed VM images
it uses for testing; a recent example is:
https://github.com/cockpit-project/cockpit/pull/10480
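
As a rough sketch of that "CI bot updating pinned data" pattern: the bot
re-resolves the package set, compares it with the committed lockfile, and
proposes a change only if something differs. This is not how Dependabot or
the Cockpit bot are actually implemented; `lockfile_diff` and the package
names and versions below are made up for illustration.

```python
def lockfile_diff(old, new):
    """Return human-readable changes between two lockfiles.

    Each lockfile is modelled as a dict: package name -> pinned version.
    (Hypothetical format, chosen only for this sketch.)
    """
    changes = []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before == after:
            continue  # still pinned to the same version
        if before is None:
            changes.append(f"add {name} {after}")
        elif after is None:
            changes.append(f"remove {name} {before}")
        else:
            changes.append(f"update {name} {before} -> {after}")
    return changes


if __name__ == "__main__":
    # Illustrative data only: what is committed vs. what a fresh resolve gives.
    pinned = {"openssh-server": "7.9p1-1", "bash": "4.4-5"}
    resolved = {"openssh-server": "7.9p1-2", "bash": "4.4-5", "zstd": "1.3.5-1"}
    for line in lockfile_diff(pinned, resolved):
        print(line)
```

An empty result means the bot has nothing to do; a non-empty one becomes the
body of the pull request that updates the lockfile.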
Another thing that's strongly related to this, though, is that IMO
classic package metadata (dpkg/rpm) needs versioning. And it'd
probably be nice if e.g. crates.io also had a version number one
could reference in addition to the git sha1.
Given that, one could specify e.g. a hypothetical
`apt/yum install --repoversion X.Y.Z` and also get reproducibility.
The reason I really want this though is that for rpm-ostree on the
client side, one quickly runs into the fact that ostree has a very
nice git-like history model with clear checksums and versions,
and rpm...has no such thing.
https://github.com/projectatomic/rpm-ostree/issues/415
Actually, in Fedora today the "pungi" tool does output versioned
directories: https://kojipkgs.fedoraproject.org/compose/updates/
But it's not an API today, and nothing in the libdnf ecosystem understands
how to parse it (as far as I know there's no index other than the
autogenerated HTML, etc.)
But anyways yeah, pinning the packages makes sense.
Oh, so you're suggesting versioning the entire state of the package archive? Basically applying a version (or a hash) to the entire list of packages and their current versions? Interesting, but it seems mostly useful for the layered package scenario.
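
To make the "hash over the entire list of packages" idea concrete, here is a minimal sketch. It is only an illustration of the concept, not something dpkg, rpm, or libdnf provide; `repo_state_id` and the example packages are hypothetical.

```python
import hashlib


def repo_state_id(packages):
    """Derive one identifier for a whole repository state.

    `packages` maps package name -> version string. The entries are
    sorted first so the result does not depend on dict ordering.
    Assumes package names never contain "=" or newlines.
    """
    lines = sorted(f"{name}={version}" for name, version in packages.items())
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Illustrative data only.
    monday = {"bash": "4.4-5", "openssh-server": "7.9p1-1"}
    tuesday = {"bash": "4.4-5", "openssh-server": "7.9p1-2"}
    print(repo_state_id(monday))
    print(repo_state_id(tuesday))  # differs: one package was bumped
```

Any single package update changes the identifier, which is what would let a client refer to "the archive as of state X" much like a commit checksum.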
In any case, lockfiles are an excellent point. It seems almost obvious in retrospect, but I didn't even think of it.
I'm curious though, how did you deal with non-deterministic post-install scripts in Fedora/Red Hat? For example, I've seen that Fedora's OpenSSH server has some startup scripts to generate host keys at boot time, if necessary; Debian's package does that at configure time. I suppose this would be a problem for any immutable-image-style approach, not just OSTree. Did you just go around filing bug reports as you noticed them? Or did it not turn out to be a big issue in practice?
> I would like to extract each deb immediately after downloading into its
> own ostree
Yep, rpm-ostree does this, although sadly right now only after
downloading all of the packages; the two steps aren't interleaved yet.
> This would also save more disk-space: we'd no longer need to
> store the debs themselves, but could refer to the contents by ostree
> ref e.g. the ref dpkg/data/<sha256> might refer to the deb with that
> sha256. The lockfile has these SHA256s recorded so you'd know which
> ostree refs to use.
> This is of-course a much larger step - you'd still need to handle the
> metadata, pre-inst scripts, etc under control.gz which might be a little
> tricky, but multistrap manages it.
Having a lot of experience with this, I can say the benefit and cost
are exactly that: it's a major leap from what apt/yum etc. do today,
with some nice benefits, but one also ends up maintaining a
separate parallel path. So far that has definitely been worth it, I think.
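
The ref scheme quoted above (`dpkg/data/<sha256>`) can be sketched as
follows. This only computes ref names; it does not call libostree, and the
lockfile layout (a list of entries with `name` and `sha256` keys) is an
assumption made for the example.

```python
import hashlib

REF_PREFIX = "dpkg/data/"  # prefix taken from the quoted proposal


def deb_ref(deb_bytes):
    """Ref name under which the unpacked contents of a .deb would be stored."""
    return REF_PREFIX + hashlib.sha256(deb_bytes).hexdigest()


def refs_from_lockfile(entries):
    """Map package name -> ref, using the SHA256s recorded in the lockfile.

    `entries` is a hypothetical lockfile format: a list of dicts with
    "name" and "sha256" keys.
    """
    return {e["name"]: REF_PREFIX + e["sha256"] for e in entries}


if __name__ == "__main__":
    fake_deb = b"not a real deb, just bytes for the example"
    lock = [{"name": "example-pkg", "sha256": hashlib.sha256(fake_deb).hexdigest()}]
    refs = refs_from_lockfile(lock)
    # The ref derived from the lockfile matches the one derived from the file,
    # so a cached ref can stand in for the downloaded .deb.
    print(refs["example-pkg"] == deb_ref(fake_deb))
```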
My "build process", so to speak, only consists of running multistrap and committing the result at this point, but I've been thinking about ideas like this as well - mostly because of the mention in the docs (turns out that, yes, the compression does make up a significant chunk of the build time, especially if your build process is not doing much else). I was toying with the idea of approaching this as a generic debootstrap/multistrap replacement which just happens to be using libostree internally for caching and for assembling the final tree.
It doesn't even seem that difficult to me. You'd just have to install the package into its own root directory, disentangle the dpkg database a bit to avoid conflicts between packages, and commit the whole thing. Then you could just merge the relevant package trees and reconstitute the package database as necessary. I'm not entirely sure how to handle the preinst script - which is supposed to run before the package is even unpacked - but debootstrap must already be handling that in some way, so it should be possible. This does sound a bit too easy and I'm probably forgetting a bunch of things, but it seems to me that the resulting tree should be entirely self-contained. All that's left is to call `dpkg --configure -a`, do any other post-processing, and commit the result.
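
The "merge the relevant package trees" step could look roughly like this. It is a toy in-memory model rather than real libostree or dpkg code: each package tree is a dict of path -> content checksum, and `merge_package_trees` is a hypothetical helper. It deliberately ignores the dpkg database and maintainer scripts, which are the parts I'm least sure about.

```python
def merge_package_trees(trees):
    """Overlay per-package file trees into one root tree.

    `trees` maps package name -> {path: content checksum}.
    Two packages may ship the same path only if the content is identical;
    otherwise the merge is refused, roughly like a dpkg file conflict.
    """
    merged = {}
    owner = {}
    for pkg, files in trees.items():
        for path, checksum in files.items():
            if path in merged and merged[path] != checksum:
                raise ValueError(
                    f"{path}: conflict between {owner[path]} and {pkg}"
                )
            merged.setdefault(path, checksum)
            owner.setdefault(path, pkg)
    return merged


if __name__ == "__main__":
    # Illustrative data only; checksums are placeholders.
    trees = {
        "bash": {"/bin/bash": "aaa", "/usr/share/doc/README": "ccc"},
        "coreutils": {"/bin/ls": "bbb", "/usr/share/doc/README": "ccc"},
    }
    for path in sorted(merge_package_trees(trees)):
        print(path)
```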
I don't know how rpm-ostree handles this "package scripts" stage; I'm pretty sure RPM has some sort of post-install script as well. I can't see any alternative to running them at the end, inside the fully composed system. In the case of Debian, that means you have to e.g. regenerate the locales every time, unfortunately.
Still, it sounds fun. I feel like you could just use OSTree as a distribution vehicle for debs. That being said, this is all stuff I effectively made up, so it might not have any particular connection to reality. ;)