Re: Repo scalability issues and solutions



On Tue, Sep 15, 2020 at 8:42 AM Alexander Larsson via ostree-list
<ostree-list gnome org> wrote:

On Tue, 2020-09-15 at 09:49 -0400, Colin Walters wrote:

One random idea I had last time this was discussed was using ostree
for metadata too; basically the metadata for a repo would be in a
special `ostree-metadata` ref and it would have a filesystem layout
like `/deltas/{00,01,...}/list` where the `list` object would be a
gvariant list.  A lot like exploding our a{sv}s into an ostree
filesystem.  I don't recall where we ended up with that.  It'd
definitely involve more round trips but I have trouble thinking of
any solution that doesn't.
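
For illustration, here is a minimal sketch of how a flat delta index could be split into the `/deltas/{00,01,...}/list` layout described above. It is not ostree code. The function name `shard_deltas` is hypothetical, and keying each bucket on the first two hex digits of the target commit checksum is an assumption, since the message does not say what the `00`, `01`, ... directories are keyed on. Plain Python lists stand in for the gvariant list objects.

```python
from collections import defaultdict


def shard_deltas(delta_names):
    """Group delta names into per-prefix buckets (hypothetical helper).

    Each name is assumed to be either "FROM-TO" or just "TO", where FROM
    and TO are hex commit checksums. The bucket key is assumed to be the
    first two hex digits of TO.
    """
    shards = defaultdict(list)
    for name in delta_names:
        target = name.split("-")[-1]
        shards[target[:2]].append(name)
    # Map each bucket to the path of its `list` object in the metadata ref.
    return {
        f"/deltas/{prefix}/list": sorted(names)
        for prefix, names in shards.items()
    }
```

A client that wants the deltas for one target commit would then fetch only the matching `list` object rather than the whole index, at the cost of the extra round trips mentioned above.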

We used the `ostree-metadata` branch for p2p, and I've had nothing but
issues with it. The problem is that modifying a repo like that is very
stateful and global. With a summary file it's just a simple download or
atomic mmap, which works well for any kind of parallel access to a repo.
But, if you have multiple clients running against a repo and they all
need to update the ostree-metadata ref on-disk, possibly with multiple
versions of the summary (in the p2p multi-peer case), you run into all
sorts of issues with atomicity, races, write permissions in the repo,
etc.
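
To make the contrast concrete, the sketch below shows the general single-file pattern that makes a summary-style file easy to share between parallel readers: write to a temporary file, then rename it over the old one. This is a generic POSIX technique and not taken from ostree or flatpak. The helper name `replace_atomically` is hypothetical.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Replace the file at `path` with `data` (bytes) in one atomic step.

    Readers that open `path` see either the complete old contents or the
    complete new contents, never a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Updating a ref that points at a commit involves more than one object on disk, so a single rename like this does not cover it by itself.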

I don't see why there are any more issues with updating the
ostree-metadata ref locally than any other ref or the summary file.
Why is it different than multiple clients fetching and storing the
summary file or any other commit to the local repo? Definitely the p2p
code caused a lot of headaches and probably needs to be redesigned
(preferably in the ostree project context), but I think you already
yanked the p2p code from flatpak, right? And if you want, you could
definitely just fetch the ostree-metadata commit object without saving
to disk in exactly the same way you can with the summary file. There's
nothing magical there except that the current pull code doesn't do it.

That said, there are two major issues with this approach. A major
benefit of using a commit for the metadata is being able to make
static deltas of the data to cut down the download size. However, the
commit object (i.e., the commit metadata) is not delta'd, so you'd have
to move the interesting information into a file object in the commit.
That's not the end of the world, but if the list of deltas lives in
the ostree-metadata commit object, then you don't know what delta to
fetch. You'd have to do something like a commit-metadata-only fetch,
get the list of deltas out of there, and then pull the delta to get the
commit contents.
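
The two-step fetch described above could look roughly like the following sketch. It is not the ostree pull code. The `remote` dictionary is a hypothetical in-memory stand-in for an HTTP remote, and the keys `refs`, `commits`, `deltas`, `contents`, and the `deltas` entry inside the commit metadata are all assumptions made for illustration.

```python
def fetch_metadata_with_delta(remote, ref, local_commit):
    """Simulate a commit-metadata-only fetch followed by a delta pull.

    `remote` is a plain dict standing in for a remote repo (hypothetical
    layout). `local_commit` is the checksum already held locally, or None.
    Returns (new_commit, (kind, payload)) where kind is "delta" or "full".
    """
    # Round trip 1: resolve the ref and fetch only the commit object.
    new_commit = remote["refs"][ref]
    commit_obj = remote["commits"][new_commit]

    # Assumption: the list of available deltas lives in the commit metadata.
    available = commit_obj["metadata"].get("deltas", [])

    wanted = f"{local_commit}-{new_commit}"
    if local_commit is not None and wanted in available:
        # Round trip 2: pull the delta to get the commit contents.
        return new_commit, ("delta", remote["deltas"][wanted])

    # No usable delta: fall back to fetching the full contents.
    return new_commit, ("full", remote["contents"][new_commit])
```

The point of the sketch is the ordering constraint: the delta cannot be chosen until the commit object has been fetched, which is the extra round trip this approach costs.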

--
Dan

