Re: Validating hash of .filez objects



Hi,

On 20.03.2017 18:19, Colin Walters wrote:
On Mon, Mar 20, 2017, at 12:04 PM, Phil Wise wrote:

I'm trying to understand how the hash of filez objects is calculated in
the OSTree repo.

See https://github.com/ostreedev/ostree/blob/master/docs/manual/repo.md#content-objects

For content objects, having a matching object hash means we can safely
create a hard link.

Let's look at a counterexample; if two files have different owners, Unix permissions or
e.g. SELinux security label xattrs, we can't safely hardlink them.

This does in turn imply that a file changing mode or security label means we copy it.
This falls out of a core design principle of OSTree; all it needs is hardlinks, which
exist in every useful Linux filesystem.  We don't require CoW.

Ah, I hadn't spotted the implications of the hard linking behavior
before: because xattrs are attached to an inode rather than a
direntry, and equal hashes imply hardlink-ability, the hash of a file
needs to cover more than just its contents.

And honestly, the case of files changing mode or label is rare enough
IME to not need specific optimization.

The actual *code* in libostree doing this is a bit overly complex; the format looks like
<header size><header><content>
See e.g.:
https://github.com/ostreedev/ostree/blob/master/src/libostree/ostree-core.c#L385
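
The `<header size><header><content>` framing can be illustrated with a short sketch. The 4-byte big-endian size prefix and the zero padding up to an 8-byte boundary are assumptions made for illustration, not details confirmed in this thread, and `split_framed` is a hypothetical helper; the linked ostree-core.c is the authority on the real layout.

```python
import struct


def split_framed(blob: bytes, align: int = 8):
    """Split `<header size><header><content>` into (header, content).

    Assumed layout (not confirmed in this thread): a 4-byte big-endian
    header size, zero padding up to `align` bytes, the header itself,
    then the content.
    """
    if len(blob) < 4:
        raise ValueError("too short to hold a size prefix")
    (header_size,) = struct.unpack(">I", blob[:4])
    start = 4 + (-4 % align)  # skip the assumed padding after the prefix
    end = start + header_size
    if end > len(blob):
        raise ValueError("header size exceeds available data")
    return blob[start:end], blob[end:]
```

If the real format uses no padding, passing `align=1` makes the header start immediately after the size prefix.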

I guess this is because filez is gzip'd and the hash is calculated over
the uncompressed contents, but I'm having trouble working out an
incantation that can decompress it.  Do you have some pointers to how
this works, please?  I've tried zlib decompressing all suffixes of the
file (to skip a header) but with no luck so far.
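
One thing worth ruling out when a suffix scan like this fails: Python's `zlib.decompress` expects a zlib wrapper by default, so a raw deflate stream is rejected at every offset. The sketch below repeats the scan with `wbits=-15`, which selects raw deflate. That the .filez payload is raw deflate is an assumption here, not something stated in this thread, and `find_raw_deflate` is a hypothetical helper. A hit is only a candidate, since short garbage can occasionally decode as a valid stream.

```python
import zlib


def find_raw_deflate(blob: bytes, max_offset: int = 64):
    """Try each offset as the start of a raw deflate stream.

    Returns (offset, data) for the first offset whose stream decodes
    completely and consumes the rest of `blob`, or None if none does.
    """
    for offset in range(min(max_offset, len(blob)) + 1):
        d = zlib.decompressobj(wbits=-15)  # negative wbits: no zlib/gzip header
        try:
            data = d.decompress(blob[offset:])
        except zlib.error:
            continue
        if d.eof and not d.unused_data:
            return offset, data
    return None
```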

My end goal is to check on the server that objects are correct after
they have been uploaded.

`ostree fsck` knows how to do that.  And if you're using `ostree pull`
to copy content, it will also verify the checksums.

If you just want to do what ostree would do, try:
`ostree checksum <file>`.

```
[root@icarus ~]# ostree checksum /usr/bin/bash
6ec3a62260b87f7fcd4cfbb651f45ee65e83e86fb9da3eff69db11a7ec37bf9b
[root@icarus ~]# ls -ali /usr/bin/bash /ostree/repo/objects/6e/c3a62260b87f7fcd4cfbb651f45ee65e83e86fb9da3eff69db11a7ec37bf9b.file
331958703 -rwxr-xr-x. 3 root root 1072008 Dec 31  1969 /ostree/repo/objects/6e/c3a62260b87f7fcd4cfbb651f45ee65e83e86fb9da3eff69db11a7ec37bf9b.file
331958703 -rwxr-xr-x. 3 root root 1072008 Dec 31  1969 /usr/bin/bash
```
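
The listing above shows the deployed file and the repository object sharing one inode number, which is what being hard-linked means. A minimal sketch of that check follows. `object_path` and `is_hardlinked` are hypothetical helpers, and the `objects/<first two hex chars>/<remainder>.<ext>` layout is inferred only from the path in the listing.

```python
import os


def object_path(repo: str, checksum: str, ext: str = "file") -> str:
    """Build the object path for a checksum (layout inferred from the listing)."""
    return os.path.join(repo, "objects", checksum[:2], f"{checksum[2:]}.{ext}")


def is_hardlinked(a: str, b: str) -> bool:
    """True if both paths name the same inode on the same device."""
    sa, sb = os.lstat(a), os.lstat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)
```

For the example above, `is_hardlinked("/usr/bin/bash", object_path("/ostree/repo", "<checksum>"))` would be expected to return `True`.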

Ideally I would like to be able to check this on a file-by-file basis as
the objects are uploaded, i.e. answer the question 'is this newly
uploaded file 6ec3a..5gdd.filez self-consistent?'  I'll have a browse of
the code around ostree fsck: that must be doing what I'm after.

Thank you

Phil

-- 
Phil Wise, ATS Advanced Telematic Systems GmbH
Kantstrasse 162, 10623 Berlin
Managing Directors: Dirk Pöschl, Armin G. Schmidt
Register Court: HRB 151501 B, Amtsgericht Charlottenburg

