Re: Summary deltas



On Tue, Feb 18, 2020 at 1:58 AM Alexander Larsson via ostree-list
<ostree-list gnome org> wrote:

So, flatpak is getting rather big these days, and one of the places
where this shows up is the summary file churn. The current flathub
summary file is 5.2 megabytes uncompressed. We serve it with gzip
content-encoding at about 1.4 megabytes. However, *any* change to the set of apps
on flathub will cause a completely new summary file which all clients
have to download.

There are some inefficiencies in how summaries are stored (for example
they list all the deltas which you don't typically need). However,
fixing that is a lot more work and an incompatible change, and there
is much lower-hanging fruit.

Today I did a quick test and ran bsdiff on two consecutive flathub
summary files after a single app got updated:

-rw-r--r--. 1 alex alex 5433240 18 feb 09.29 flathub-1
-rw-r--r--. 1 alex alex 5433240 18 feb 09.29 flathub-2
-rw-rw-r--. 1 alex alex    2111 18 feb 09.30 flathub-1to2.bsdiff

Due to the way summary files are stored (uncompressed, sorted, etc.),
they naturally diff very well. It would be very easy for flathub to
store deltas from, say, the 100 latest summary files to the current one,
which would make the summary update *much* more efficient.
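
A minimal sketch of that scheme, under stated assumptions: the server keeps
one delta per recent summary, keyed by the SHA-256 of the old summary, and a
client with an unknown or too-old copy falls back to the full download. The
delta format here is a toy built on Python's difflib, standing in for bsdiff;
the function names and the keying scheme are hypothetical and not part of
ostree or flatpak.

```python
import hashlib
from difflib import SequenceMatcher


def digest(data: bytes) -> str:
    """Hex SHA-256, used here as the (assumed) key for looking up a delta."""
    return hashlib.sha256(data).hexdigest()


def make_delta(old: bytes, new: bytes) -> list:
    """Toy delta (not bsdiff): copy ranges from `old`, literal bytes otherwise."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif j2 > j1:
            # "replace" or "insert": ship the new bytes verbatim.
            ops.append(("data", new[j1:j2]))
        # "delete" needs no op: those old bytes are simply never copied.
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new summary from the cached one plus the delta ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def publish_deltas(history: list, keep: int = 100) -> dict:
    """Server side: deltas from each of the last `keep` summaries to the newest."""
    current = history[-1]
    return {digest(old): make_delta(old, current) for old in history[-1 - keep:-1]}


def update_summary(cached: bytes, deltas: dict, fetch_full) -> bytes:
    """Client side: use a delta if one matches the cached copy, else fetch it all."""
    ops = deltas.get(digest(cached))
    if ops is None:
        return fetch_full()
    return apply_delta(cached, ops)
```

A client holding any of the retained old summaries only needs the small delta;
anything older, or a corrupted cache, costs the same full download as today.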

What about putting the summary as a file in the ostree-metadata
commit? You're already generating it and then you can just use static
deltas without inventing any new object types or processing. And it
fixes all the issues of races between fetching and updating of the
detached summary signature file.

For a client to fetch remote metadata in a backwards compatible way,
it would fall back to fetching the summary if the ostree-metadata
commit didn't exist or didn't contain the summary file. On the server
side, you'd have to continue publishing the standalone summary file
for old clients.
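
Roughly, the client-side fallback could look like the sketch below. The two
fetch callables and the "summary" path inside the commit are placeholders
made up for illustration, not libostree API; the ostree-metadata commit is
modelled as a plain mapping from path to file contents.

```python
from typing import Callable, Mapping, Optional


def load_summary(
    fetch_metadata_commit: Callable[[], Optional[Mapping[str, bytes]]],
    fetch_standalone_summary: Callable[[], bytes],
    summary_path: str = "summary",  # hypothetical path inside the commit
) -> bytes:
    """Prefer the summary stored in the ostree-metadata commit.

    `fetch_metadata_commit` is assumed to return the commit's files, or
    None if the remote has no ostree-metadata ref at all.
    """
    files = fetch_metadata_commit()
    if files is not None and summary_path in files:
        # New-style remote: the summary arrived with the commit, so it
        # could have been pulled using ordinary static deltas.
        return files[summary_path]
    # Old-style remote (no commit, or a commit without the summary file):
    # fetch the standalone summary exactly as clients do today.
    return fetch_standalone_summary()
```

The server keeps publishing the standalone file, so the second branch keeps
working for old clients and for remotes that have not been updated.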

--
Dan

