Re: Repo scalability issues and solutions



On Tue, 2020-09-15 at 09:49 -0400, Colin Walters wrote:
Sorry about the delay here.

On Tue, Aug 25, 2020, at 8:38 AM, Alexander Larsson via ostree-list wrote:

Does the above make sense to everyone? Do we have any other ideas how
we could do better? Do we have some important feature we would like in
the new format?

Note: While some of these changes apply to ostree, some apply just to
flatpak. However, I want to synchronize the changes so that we only
have to do a single format change.

Right, implicitly here you mean "ostree for operating systems" and I
think that's a core tension here: what one wants for operating
systems w/ostree is quite different in scale and mechanics from
applications.

It's not clear to me actually that *all* of the logic around
applications needs to live in ostree itself - for example, flatpak
can define its own higher level metadata and tooling.  Now there are
clearly tradeoffs here - for example, it might mean that local
mirroring requires using some flatpak tooling on the server side too
(but you're already building that right?).

My mail kinda mixed up things that would be part of flatpak and part of
ostree. Obviously the flatpak-specific metadata would be created and
consumed by flatpak. However, they are somewhat related, as we can use
the existence of a new ostree summary format to trigger the new flatpak
formats, so that the old backwards-compat files pair old ostree with
old flatpak while the new ones are new+new.

Some of your proposed changes make total sense, but I think given the
possibility to "greenfield" things here more it's probably worth
thinking about higher level and more long-term changes.  We don't
want to accumulate too many formats because the pull code is already
incredibly complicated and under-documented.  Data formats are hard.

As I've been slowly implementing these things over the last weeks I've
ended up simplifying what changes we do on the ostree side. Currently
I'm ending up with these changes to the actual summary file format:

 * Split out deltas into their own per-target-commit index file
 * Add a new, shorter key name for the per-ref `ostree.commit.timestamp`
   metadata
 * Add a version key in the metadata

These are pretty simple optional features that don't really change
the code that accesses the summary format much.
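
As a rough illustration of the first item (not the actual ostree code), splitting a flat delta list into per-target-commit indexes could look like the sketch below. It assumes delta names have the form `FROM-TO` for an incremental delta and a bare `TO` for a from-scratch delta; `split_deltas_by_target` and the short placeholder checksums are hypothetical.

```python
from collections import defaultdict


def split_deltas_by_target(deltas):
    """Group a flat {delta_name: digest} map by each delta's target commit.

    Assumption: a name is either "FROM-TO" (incremental) or "TO" (from scratch),
    so the target is always the part after the last "-".
    """
    per_target = defaultdict(dict)
    for name, digest in deltas.items():
        target = name.rsplit("-", 1)[-1]
        per_target[target][name] = digest
    return dict(per_target)


# Placeholder commit ids and digests, purely for illustration.
flat = {
    "aaa-bbb": "d1",  # incremental delta aaa -> bbb
    "bbb": "d2",      # from-scratch delta for bbb
    "bbb-ccc": "d3",  # incremental delta bbb -> ccc
}
indexes = split_deltas_by_target(flat)
```

A client that wants to pull commit `bbb` would then only need the small `indexes["bbb"]` file instead of the whole delta list.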

Then I've created a summary index, which is basically a summary of
summaries: one default summary with all the refs, and optionally some
partial summaries for named subsets. These summaries are accessed by
checksum, so that they cache well, can easily be delta:ed, etc.
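
The checksum-addressed lookup described above can be sketched roughly as follows. This is only an illustration under assumptions: plain bytes stand in for the real serialized summaries, SHA-256 is assumed as the checksum, the empty string is used as the name of the default subset, and `build_summary_index` and `SummaryCache` are hypothetical names.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_summary_index(summaries):
    """Map each subset name to the checksum of its summary.

    Returns (index, blobs), where blobs maps checksum -> summary bytes.
    """
    index = {}
    blobs = {}
    for name, data in summaries.items():
        digest = checksum(data)
        index[name] = digest
        blobs[digest] = data
    return index, blobs


class SummaryCache:
    """Client-side cache keyed by checksum, so an unchanged summary is fetched once."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: checksum -> bytes (e.g. an HTTP GET)
        self._store = {}
        self.fetches = 0

    def get(self, digest):
        if digest not in self._store:
            data = self._fetch(digest)
            if checksum(data) != digest:
                raise ValueError("summary checksum mismatch")
            self._store[digest] = data
            self.fetches += 1
        return self._store[digest]


summaries = {"": b"all refs", "x86_64": b"x86_64 refs only"}
index, blobs = build_summary_index(summaries)
cache = SummaryCache(blobs.__getitem__)
first = cache.get(index["x86_64"])
second = cache.get(index["x86_64"])  # served from the cache, no second fetch
```

Because the name of each summary is derived from its content, only the small index needs to be re-checked on every pull.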

This does somewhat complicate the pull logic, but not that much really
(also the work I've been doing has been cleaning up the pull code a
bit). I'll try to finish the branch with this work tomorrow and get it
in a state where it can be reviewed and discussed. 

One random idea I had last time this was discussed was using ostree
for metadata too; basically the metadata for a repo would be in a
special `ostree-metadata` ref and it would have a filesystem layout
like `/deltas/{00,01,...}/list` where the `list` object would be a
gvariant list.  A lot like exploding our a{sv}s into an ostree
filesystem.  I don't recall where we ended up with that.  It'd
definitely involve more round trips but I have trouble thinking of
any solution that doesn't.
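
For illustration only, the `/deltas/{00,01,...}/list` layout might be computed as in the sketch below. The mail does not say what the two-character directories are keyed on; this assumes they are the first two hex digits of a target commit checksum. `delta_list_path` and `shard_deltas` are hypothetical helpers, and the short checksums are placeholders.

```python
from collections import defaultdict


def delta_list_path(target_checksum: str) -> str:
    # Assumption: shard by the first two hex digits of the checksum.
    prefix = target_checksum[:2].lower()
    return f"/deltas/{prefix}/list"


def shard_deltas(targets):
    """Group checksums by the `list` file they would be recorded in."""
    shards = defaultdict(list)
    for checksum in sorted(targets):
        shards[delta_list_path(checksum)].append(checksum)
    return dict(shards)


layout = shard_deltas(["00ab12", "00ff34", "01cd56"])
```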

We used the `ostree-metadata` branch for p2p, and I've had nothing but
issues with it. The problem is that modifying a repo like that is very
stateful and global. With a summary file it's just a simple download or
atomic mmap, which works well for any kind of parallel access to a repo.
But if you have multiple clients running against a repo and they all
need to update the `ostree-metadata` ref on disk, possibly with multiple
versions of the summary (in the p2p multi-peer case), you run into all
sorts of issues with atomicity, races, write permissions in the repo,
etc.

Honestly I don't think this is a good approach.
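
The single-file side of that contrast can be sketched as below: a writer replaces the summary with one atomic rename, and readers map whichever complete version is present. This is a generic illustration of the pattern, not ostree's implementation; `publish_summary` and `read_summary` are hypothetical names, and the summary is assumed to be non-empty.

```python
import mmap
import os
import tempfile


def publish_summary(path, data):
    """Write data to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".summary-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Atomic replacement: readers see either the old or the new file, never a mix.
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def read_summary(path):
    """Map the current summary read-only and return its contents."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m)
```

A reader that already has the old file mapped keeps a consistent view of it after the rename, which is why this needs no locking between clients.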



