Re: Static object deltas plan



Let me try one other way to say this too: OSTree static deltas have a
strong split between the diff *format* and the diff *generator*,
because different use cases demand different tuning options.

For example, one case is where we're doing online continuous
integration, and the users have high-bandwidth internet connectivity.
Here we care a lot about *latency* of diff generation.  This is where
xdiff hurts - it tries too hard to be minimal.  For this type of use
case, we might do:

$ ostree gendelta --tune=continuous

And the diff compiler might avoid expensive bsdiff/xdiff style string
computations.  It might even use gzip -1.  A static delta could
literally just be the moral equivalent of
"tar newfile1 newfile2 | gzip -1".  This type of diff should not add
more than, say, 5-10 seconds of latency on top of the regular build.
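
To make the "tar | gzip -1" idea concrete, here is a minimal Python
sketch.  The helper names (make_fast_delta, apply_fast_delta) and the
in-memory {path: bytes} mapping are hypothetical and are not part of
OSTree; the point is only that such a delta costs almost nothing to
generate.

```python
import io
import tarfile


def make_fast_delta(new_files):
    """Pack new file contents into a gzip -1 tarball (hypothetical helper).

    new_files maps a path to the bytes of the new version of that file.
    """
    buf = io.BytesIO()
    # compresslevel=1 corresponds to "gzip -1": fastest, least compression.
    with tarfile.open(fileobj=buf, mode="w:gz", compresslevel=1) as tar:
        for path, data in sorted(new_files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def apply_fast_delta(delta):
    """Unpack the tarball back into a {path: bytes} mapping."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(delta), mode="r:gz") as tar:
        for member in tar.getmembers():
            out[member.name] = tar.extractfile(member).read()
    return out
```

No comparison against the old files happens at all here, which is
exactly why this kind of delta is fast to produce.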



But on the complete other side of the pendulum, for the case where we're
generating a delta between major version updates, and shipping that to
hundreds of thousands of clients, some with 3G or lower level of
internet connectivity, we very likely want to spend a substantial amount
of build server CPU time to minimize network bandwidth.

$ ostree gendelta --tune=major

This might tell the delta compiler to use bsdiff as much as possible, to
try compressing data files with "xz --best", to try very hard to find
object matches (think even to the extent of pulling bits
from /usr/lib/libgtk-2.0.so to generate the new /usr/lib/libgtk-3.0.so).
This sort of thing could easily occupy a set of fast Intel cores for
*days*, and that's assuming a very good algorithm for searching the
operation space.
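
A very reduced Python sketch of the "try hard" side, under stated
assumptions: the generator tries several encodings per payload and
keeps the smallest, while the delta records a tag saying which one was
used, so the client-side decoder stays the same no matter how much CPU
the generator burned.  The tag names and functions are made up for
illustration; bsdiff and cross-object matching are left out because
they are not in the Python standard library.

```python
import lzma
import zlib

# Decoders keyed by the tag stored next to each payload.  The tag names
# are invented for this sketch and are not an OSTree format.
DECODERS = {
    "raw": lambda blob: blob,
    "zlib-9": zlib.decompress,
    "xz-9": lzma.decompress,
}


def encode_smallest(data):
    """Generator side: try every encoding and keep the smallest result."""
    candidates = [
        ("raw", data),
        ("zlib-9", zlib.compress(data, 9)),
        ("xz-9", lzma.compress(data, preset=9)),  # roughly "xz --best"
    ]
    # min() returns the first smallest entry, so "raw" wins any ties.
    return min(candidates, key=lambda c: len(c[1]))


def decode(tag, blob):
    """Client side: only needs the tag, not the generator's tuning."""
    return DECODERS[tag](blob)
```

A "continuous" style generator could emit only "raw" or a fast
encoding and the same decode() would still apply, which is the
format/generator split in miniature.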





