Re: OSTree and OCI images



On Thu, 2016-11-17 at 15:41 -0500, Colin Walters wrote:
(Email is better for architecture discussions, so moving back here)

On Thu, Nov 17, 2016, at 05:10 AM, Alexander Larsson wrote:


Really the only flatpak specific thing is the mapping from flatpak arch
strings to golang/oci arch strings.

Hum, maybe we can just have "hook" functions that allow the caller
to tweak the content.

I'm not sure this will need to be addressed at all in the ostree layer,
we'll have to see.


Yeah, I've started extracting this. The first part is ostree support
for the Docker/OCI layer format (i.e. whiteout handling and file
overriding), which has a PR here:

https://github.com/ostreedev/ostree/pull/578

So can we hash out on the list here - what are the advantages
and disadvantages of this "flattening" approach?

The atomic command right now represents each OCI layer
as an ostree ref.  When we want to deploy a container,
we compute the filesystem tree at checkout.

In this model OSTree isn't really involved at the transport
level at all, and we don't support image reassembly (bit for bit),
since Docker/OCI basically make that impossible[1].

I'm trying to understand what you mean by
"However we run into layers as we start importing apps from
oci images which may have multiple layers that need to be
combined."

Can you flesh this out a bit more?  What Docker/OCI app would
be imported into flatpak?  And for this use case, it's OK if
we discard the layering structure?  I.e. we don't have a use
case of wanting to pull just modified layers?

So, let's start with the basic origin of this. OCI images, and the
standard OCI image directory layout over HTTP is likely going to be an
important standard for distribution of content, be it actual OCI
containers, data images, or whatever. As people/organizations have
infrastructure for these set up it makes sense to support that in
things like flatpak and ostree.

For instance, you might have a build system that produces an OCI image,
because that is how your organization's build system works. But then you
want to deploy that image as the root-fs on your system that uses a
plain ostree approach. For such a setup it seems like a very
useful operation to be able to specify a remote OCI repo + image
name/tag and then have it pull that into your local ostree repo.

There are two ways that can work. One is to pull each layer of the OCI
image to its own branch and store the OCI-level manifest data
somewhere (its own branch? commit metadata?). Then we can emulate the
OCI process when we check out, applying each layer in order.

The other approach is to apply all the layers during the commit, ending
up with a flattened single branch.
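
The flattening step can be sketched roughly as follows. This is a toy
model in Python, not the code from the ostree PR: each layer is assumed
to be a plain dict from path to content (real layers are tar streams),
and the name flatten_layers is hypothetical. The whiteout names
(".wh.<name>" and ".wh..wh..opq") are the ones the OCI layer format
uses.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def flatten_layers(layers):
    """Apply layers in order (lowest first) and return one merged tree.

    Each layer is a dict mapping a relative file path to its content.
    """
    tree = {}
    for layer in layers:
        # Pass 1: whiteouts remove content that came from lower layers.
        for path in layer:
            parent, name = posixpath.split(path)
            if name == OPAQUE_MARKER:
                # Opaque whiteout: hide everything below `parent`.
                prefix = parent + "/" if parent else ""
                for old in [p for p in tree if p.startswith(prefix)]:
                    del tree[old]
            elif name.startswith(WHITEOUT_PREFIX):
                # Plain whiteout: hide one file, or a whole directory.
                target = posixpath.join(parent, name[len(WHITEOUT_PREFIX):])
                for old in [p for p in tree
                            if p == target or p.startswith(target + "/")]:
                    del tree[old]
        # Pass 2: regular files add to, or override, what is left.
        for path, content in layer.items():
            if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                tree[path] = content
    return tree


# Illustrative data only.
base = {"etc/motd": "hello", "usr/bin/tool": "v1", "var/cache/a": "x"}
update = {
    "usr/bin/tool": "v2",            # overrides the base file
    "etc/.wh.motd": "",              # removes etc/motd
    "var/cache/.wh..wh..opq": "",    # hides the old var/cache contents
    "var/cache/b": "y",
}
flat = flatten_layers([base, update])
# flat == {"usr/bin/tool": "v2", "var/cache/b": "y"}
```

The result is a single tree with no trace of the layering, which is
what would then be committed as one branch.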

Which of these two approaches you want to use depends on how you will
be using the data in the end, but each has its advantages:

* Flattened storage
 - This is the native ostree model, the consumer need not be aware
   that the origin of the image was layered. You don't need to parse
   some metadata and know which branches to checkout and in what order
   to apply them, or when it is safe to remove some layer during 
   pruning.

 - If any files are removed or replaced in the layer stack this gets 
   squashed, so the final resulting image is smaller.

* Layered storage
 - If you're storing many images the layers can be shared between
   images. Note, this is not as important as you might imagine since
   the file *content* will always be shared in ostree. However, ostree
   objects like dirmeta and dirtrees can be shared that otherwise
   would not be.

 - Storing layers individually means it is more efficient to download
   multiple images if they are using a shared base layer.

 - Storing the layers separately means we're closer to the original
   images, should we ever need them back. In fact, if we store the
   image json file and original tar-headers for each individual layer
   we can reconstruct the original OCI images. (Recompression will
   change the sha256 of the compressed blobs, so the reconstructed
   manifests will differ, but the image itself refers to the sha256 of
   the uncompressed tar.)
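
The point about recompression can be illustrated with a small Python
sketch. The layer bytes here are a made-up stand-in for a real tar
stream, and the two gzip runs differ only in the header timestamp, which
is enough to change the compressed digest.

```python
import gzip
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


# Stand-in for an uncompressed layer tarball (illustrative bytes only).
layer_tar = b"pretend this is a layer tar stream\n" * 100

# Two compressions of the same layer; only the gzip header mtime differs.
blob_original = gzip.compress(layer_tar, compresslevel=9, mtime=0)
blob_rebuilt = gzip.compress(layer_tar, compresslevel=9, mtime=1)

# The manifest refers to the compressed blob, so its digest changes ...
manifest_digest_changed = sha256_hex(blob_original) != sha256_hex(blob_rebuilt)

# ... while the image refers to the uncompressed tar, which does not.
diff_id_original = sha256_hex(gzip.decompress(blob_original))
diff_id_rebuilt = sha256_hex(gzip.decompress(blob_rebuilt))
```

So a reconstructed image can keep the same uncompressed layer digests
even though the manifest that lists the compressed blobs is new.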

Another important operation is pushing an ostree branch into an OCI
registry, which may be important if you have an ostree client but your
organization's distribution mechanism of choice is OCI repos. You can
then push an ostree branch to an OCI image, adding enough commit
metadata to the image so you can then perfectly reconstruct it again.

Now, when would you use each of these?

I don't think OCI reconstruction is all that important, and the sharing
is likely to be better overall in ostree storage, so the main
benefit of layered storage is the download size when downloading
directly from an OCI registry. In other words, if OCI is used inside
your tooling somewhere, but in the end it ends up in an ostree repo
where your clients pull from it, then using a flattened approach is
best, but if your clients are pulling from an OCI registry, then a
layered approach might be better.
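
As a rough illustration of the download-size argument, here is a toy
Python model. The digests and sizes are invented, the helper name
bytes_to_download is hypothetical, and it assumes a flattened client
keeps no record of which layers it has already merged.

```python
def bytes_to_download(image_layers, cached_digests):
    """Sum the sizes of layers that are not already stored locally."""
    return sum(size for digest, size in image_layers
               if digest not in cached_digests)


# Two images sharing a base layer (placeholder digests, made-up sizes).
base = ("sha256:base", 200)
app_a = [base, ("sha256:app-a", 30)]
app_b = [base, ("sha256:app-b", 50)]

# Layered storage: after pulling app_a, the base layer is found by digest.
layered_cost = bytes_to_download(app_b, {digest for digest, _ in app_a})

# Flattened storage: layer identity was discarded at commit time, so there
# is nothing to match against and every layer is fetched again.
flattened_cost = bytes_to_download(app_b, set())
```

With these numbers the layered client fetches 50 units for the second
image and the flattened one fetches 250. The gap only matters when
clients pull from the OCI registry directly and images actually share
layers.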

In the case of flatpak, we're currently storing the apps as single
branches in the local ostree repo, and to avoid lots of extra
complexity I'd like to keep it that way. This means it will download
each layer individually and merge them during the commit. It sounds
like this may be less efficient when downloading from an OCI registry,
but in practice flatpak apps are not likely to be layered due to the
runtime/app split and the fact that they are generally not built in a
layered fashion (i.e. not via Dockerfiles).

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
       alexl redhat com            alexander larsson gmail com 
He's a benighted hunchbacked firefighter searching for his wife's true 
killer. She's a time-travelling foul-mouthed soap star with an incredible 
destiny. They fight crime! 

