Re: [Summary] Meta-data/filesystem-encapsulation

From: Christopher Curtis <ccurtis ee fit edu>
To: Kevin Littlejohn <darius connect com au>
cc: gnome-list gnome org
Subject: Re: [Summary] Meta-data/filesystem-encapsulation
Date: Tue, 18 Aug 1998 00:32:30 -0400 (EDT)

On Tue, 18 Aug 1998, Kevin Littlejohn wrote:

> What about the case of daemons rotating logfiles?  Or daemons that create

Well, *if* the data were with the file (in its inode) a 'mv' wouldn't harm
that data.  In most situations though, yes, this daemon would need the
library preloaded.  The only other option is to have the data orphaned.

If you simply want the newest file to have a particular association and
the older files to retain similar (to each other) but different (from the
newest) settings, this is best done with a regexp type database.  More on
this below...
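
To make the regexp idea concrete, here is a minimal C sketch built on POSIX
regcomp()/regexec().  The rule table, the meta_rule struct, the
lookup_association() name and the "log-current"/"log-rotated" labels are all
invented for illustration; none of this is existing GNOME code.  The first
matching pattern wins, so the live logfile and its rotated copies get
different settings without any per-file record that a 'mv' could orphan.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical rule: a POSIX extended regex and the association it selects. */
struct meta_rule {
    const char *pattern;
    const char *association;
};

/* Illustrative rules only: the live logfile, then its rotated copies. */
static const struct meta_rule rules[] = {
    { "^messages$",          "log-current" },
    { "^messages\\.[0-9]+$", "log-rotated" },
};

/* Return the association of the first rule matching name, or NULL. */
const char *lookup_association(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        regex_t re;
        int matched;

        if (regcomp(&re, rules[i].pattern, REG_EXTENDED | REG_NOSUB) != 0)
            continue;               /* skip a pattern that fails to compile */
        matched = (regexec(&re, name, 0, NULL, 0) == 0);
        regfree(&re);
        if (matched)
            return rules[i].association;
    }
    return NULL;
}
```

A real database would compile each pattern once instead of on every lookup;
the sketch recompiles to stay short.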

> many files (incoming ftp server, perhaps *shrug*).  I want to be able to
> assign a particular icon to new, incoming files off that ftp server - but

The simplest way to do this is to have those files carry a 'null' (or
default) icon, and have processed files carry a 'processed' icon.

> from.  I also don't fancy, even on the boxes I _do_ have root on, having
> to preload this library all the way through - if my desktop database breaks,
> my OS might not boot up?  That doesn't seem too logical to me...

Well ... LD_PRELOAD does not work for SUID/SGID programs - it may or may
not work for the root user - I'm not too sure.  However, if you want a
daemon to use the preload, set the LD_PRELOAD in that daemon's
environment.  I would never suggest setting it in the root environ.
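
One way to keep the preload out of everyone else's environment is to build a
private environment for just that daemon.  The sketch below is an assumption
about how a launcher might do it: env_with_preload() is a made-up helper, and
the library path in any real use would be whatever the GNOME preload library
ends up being called.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a new NULL-terminated environment for one child process:
 * every entry of envp except an inherited LD_PRELOAD, followed by
 * "LD_PRELOAD=<lib>".  Old entries are borrowed, not copied; only the
 * last entry and the array itself are heap-allocated.
 * Returns NULL if allocation fails.
 */
char **env_with_preload(char *const envp[], const char *lib)
{
    static const char key[] = "LD_PRELOAD=";
    size_t n = 0, i, out = 0;
    char **env, *entry;

    assert(envp != NULL && lib != NULL);
    while (envp[n] != NULL)
        n++;

    env = calloc(n + 2, sizeof *env);          /* old + new + terminator */
    entry = malloc(sizeof key + strlen(lib));  /* sizeof key counts the NUL */
    if (env == NULL || entry == NULL) {
        free(env);
        free(entry);
        return NULL;
    }
    sprintf(entry, "%s%s", key, lib);

    for (i = 0; i < n; i++)
        if (strncmp(envp[i], key, sizeof key - 1) != 0)
            env[out++] = envp[i];              /* keep everything else */
    env[out] = entry;                          /* env[out + 1] is already NULL */
    return env;
}
```

The launcher would pass the result as the third argument of execve() in the
forked child, so the variable never appears in root's own environment.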

> I'm also opposed to embedding information into the files, for a number of
> reasons - it's difficult to clean out if I decide I don't like gnome

That's not true at all -- I sure like this (hypothetical) find command:

find / -metadata \* -exec ResetGnomeData {} \;

> anymore, it runs the risk of stuffing other non-gnome-aware programs up
> if gnome guesses the filetype wrongly, and it's just plain too much tinkering.

It is a lot of tinkering.  I think it's worthwhile though...  About
stuffing up non-Gnome aware programs, I just don't know what you mean.
Putting the metadata with the raw data won't affect those programs at all.

Let's look at the inode struct for ext2 ... It has, among other things,
this:

        union {
                struct {
                        __u32  l_i_reserved1;
                } linux1;
                struct {
                        __u32  h_i_translator;
                } hurd1;
                struct {
                        __u32  m_i_reserved1;
                } masix1;
        } osd1;                         /* OS dependent 1 */

I suspect this is already being used (especially by the hurd) but there's
an OS dependent 2 structure in there as well.  Now, imagine that
inode.osd1.linux1 contains a pointer to another block, which is in reality
another inode just like the real data inode, except it contains only
metadata information, exactly as it would appear if it were an entry in a
non-integrated metadata database.  Standard read()/write() calls will only
see the real data - the GNOME libs will have to look at this structure
specifically, and address the metadata inode directly (or convince the
ext2fs driver to do it).  I'm not familiar with that low-level
programming.  It *is* very low level, but should not require rewriting the
driver (because, after all, we have the source to ext2 and can do anything
it can do).
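
Here is a toy model of that layout, entirely in memory.  It is not the ext2
driver and does not use the real struct ext2_inode; toy_inode, toy_table and
both functions are invented to show the one idea that matters: the ordinary
read path never follows the osd1 pointer, while a metadata-aware caller does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NINODES 8

/* Toy stand-in for an on-disk inode; this is NOT struct ext2_inode. */
struct toy_inode {
    const char *content;  /* the bytes an ordinary read() would return */
    uint32_t    osd1;     /* 0 = no metadata, else number of the metadata inode */
};

/* Pretend inode table; slot 0 is reserved to mean "none". */
static struct toy_inode toy_table[TOY_NINODES];

/* Ordinary path: returns the file's data and never looks at osd1. */
const char *toy_read_data(uint32_t ino)
{
    assert(ino > 0 && ino < TOY_NINODES);
    return toy_table[ino].content;
}

/* Metadata-aware path: follow osd1 to the shadow inode, if there is one. */
const char *toy_read_meta(uint32_t ino)
{
    uint32_t meta;

    assert(ino > 0 && ino < TOY_NINODES);
    meta = toy_table[ino].osd1;
    if (meta == 0 || meta >= TOY_NINODES)
        return NULL;              /* no metadata, or a corrupt pointer */
    return toy_table[meta].content;
}
```

Because the shadow inode is an inode like any other, it could carry its own
owner and mode bits, which is what the permissions argument further down
relies on.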

> It also shares a problem with the next option - how do I assign metadata
> to files over which I have read-only permissions?  'cause I'm gonna want

Each user will have to have their own 'preferences' database, even if my
integrated filesystem approach is used.  So it's clear (I hope), my
suggestion to integrate the data into the fs does not alleviate any of the
other strains GNOME metadata storage people will face - it is simply an
alternative to a global, instance-specific database.  We will still need a
global mime.types for classes of information.  We will still need a
userlevel preference database.  What we won't need is a global "registry"
like Windows or OS/2.  What we will gain is the ability for the owner of a
file to embed any data (s)he wishes without having to worry about other
people mucking it up.  Example: An "Author" tag.  Sure, you can figure out
the author by the UID - until /etc/passwd gets nuked (it happens). 
However, you don't want just anybody to be able to override this tag and
claim it as their own.  So not only do you need metadata, but you need
metadata permissions for each tag.  If you don't store this in the file
itself, it's a whole other level that you have to impose on top of a
global registry - a whole new kernel protection scheme if you will, just
for metadata.  In the file, the kernel will handle it.
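
To show what "metadata permissions for each tag" would mean if a registry
had to enforce them itself, here is a small sketch.  The meta_tag struct,
tag_set() and the TAG_OTHER_WRITE bit are hypothetical; with the data in the
filesystem, the equivalent check would be the kernel's ordinary owner/mode
test rather than code like this.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

#define TAG_VALUE_MAX   64
#define TAG_OTHER_WRITE 0002   /* same bit position as S_IWOTH, reused for tags */

/* Hypothetical per-tag record: each tag has its own owner and mode. */
struct meta_tag {
    const char *name;
    char        value[TAG_VALUE_MAX];
    uid_t       owner;
    unsigned    mode;
};

/*
 * Set a tag's value on behalf of caller.  Returns 0 on success, or -1 if
 * the caller lacks permission or the value does not fit; the tag is left
 * unchanged on failure.
 */
int tag_set(struct meta_tag *tag, uid_t caller, const char *value)
{
    size_t len;

    assert(tag != NULL && value != NULL);
    if (caller != tag->owner && !(tag->mode & TAG_OTHER_WRITE))
        return -1;
    len = strlen(value);
    if (len >= TAG_VALUE_MAX)
        return -1;
    memcpy(tag->value, value, len + 1);   /* include the terminating NUL */
    return 0;
}
```

With an "Author" tag owned by uid 1000 and mode 0, a call from uid 1001 is
refused and the original value stays put.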

> Which leaves the extended attributes - the other case, which only applies
> on certain filesystems, and suffers the above file permissions problems.

I've lost the gist of what you were saying, but I think I just repeated it.

> > *I* never suggested that a preload alone would suffice in all cases.  I
> > explicitly said (several times) that an alternate would always be needed.
> > Preloads simply make it harder to unintentionally orphan data, which I
> > think is a Good Thing (tm).
> 
> The problem is, once you accept preloads, you start bringing embedded
> metadata storage into the picture (under the assumption that everything
> will be able to cope, because everything will use the library).  If everything
> is _not_ using the library, then embedding information is dangerous -
> and if that's the case, the preload scenario _doesn't_ gain you anything
> _in terms of design_.  Once we've got a core library, then you could easily
> produce a preload-capable wrapper, and preload to your heart's content on
> your own system - but if the design of the metainfo database assumes that
> we can gain this sort of coverage, we've got problems.

I think you misunderstand.  LD_PRELOAD has nothing to do with fs
integration.  It has only to do with data preservation.  I don't care if
the data is in the fs or in a database - it can be as easily orphaned
either way.

But as I said, yes, if you do embed the data and then lose it, it will
show up as a "lost chain" in the filesystem.  This will be bad because it
wastes disk space.  However, if this does happen and you fsck the drive,
these chains will reappear in /lost+found, no?  Then, imagine this: when
GNOME boots, give it a "-recover-directory:" parameter where it will scan
these lost chains and reattach them to their parent objects.  This can be
done if a reference to the original object is kept with the EA data, and
will be about as effective as Win95's "shortcut resolution".  That's about
the best you can do once the data has been orphaned.  Another beautiful thing
about this is that since this data file is intact, you can move it as a
single entity, and GNOME should be able to read it as one, and then
re-integrate it into whatever database system is used, be it flatfile, fs
integrated, or remote daemon.  How slick would that be?
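
A rough sketch of that recovery pass, under one big assumption: the
back-reference kept with the metadata is the parent's inode number.  Inode
numbers get reused, so this is best-effort in the same way shortcut
resolution is.  The live_file and orphan structs and reattach_orphans() are
illustrative names only.

```c
#include <assert.h>
#include <stddef.h>

/* A live file as the recovery pass sees it (illustrative only). */
struct live_file {
    unsigned long ino;
    const char   *meta;       /* NULL while the file has no metadata */
};

/* A metadata chain found in /lost+found, with its back-reference. */
struct orphan {
    unsigned long parent_ino; /* reference to the original object */
    const char   *meta;
    int           reattached;
};

/*
 * Reattach each orphan whose parent still exists and currently has no
 * metadata.  Returns how many orphans were reattached.
 */
size_t reattach_orphans(struct live_file *files, size_t nfiles,
                        struct orphan *orphans, size_t norphans)
{
    size_t i, j, fixed = 0;

    assert(files != NULL && orphans != NULL);
    for (i = 0; i < norphans; i++) {
        for (j = 0; j < nfiles; j++) {
            if (files[j].ino != orphans[i].parent_ino || files[j].meta != NULL)
                continue;
            files[j].meta = orphans[i].meta;
            orphans[i].reattached = 1;
            fixed++;
            break;
        }
    }
    return fixed;
}
```

Orphans whose parent is gone, or whose parent already has newer metadata,
are left flagged as not reattached so the user can decide what to do with
them.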

> *nod* We're in agreement here - the only thing that's keeping this running
> is the reference to preload, which I think clouds the issue of database
> design.  Consider preload as an added extra that might appear sometime
> further down the track as a wrapper, and we're both happy :)

Sure - the two things have nothing to do with one another, except that a
preload will help to ensure the consistency of the database from non-GNOME
aware apps.

> Almost - I think this is the next discussion - my preference is heavily

This should be short as long as you don't balk at me wanting to use spare
data structs inside ext2.  :)

> with a personal database + system database, rather than trying to get the

These two will always be needed.  In fact, more will be needed as there
should be at least two system databases - mime.types, and then the
file-specific database (which I want to be part of the fs, using the fs
as the database, if you will).

> metadata 'near' the file - I _hate_ having .* files everywhere, and I
> think where you put the data is actually near-irrelevant AFA how likely
> it is to be orphaned - if you use the metainfo library, it won't be orphaned,

I hate .link-to files too.  These make it even more unreliable in my
opinion.  Easier to recover when things go wrong, no doubt, but very
hokey.  If the db is abstracted, both can be used.  .files can be used
when testing it (for easy recovery) then a flatfile can be used for
speed/reliability once the core logic is debugged.
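
"Abstracted" here could be as little as a table of function pointers.  In
the sketch below the meta_backend interface and the in-memory backend are
both assumptions made up for illustration; a dotfile backend and a flatfile
backend would each fill in the same two functions, and callers would never
know which one they were talking to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend interface; every name here is illustrative. */
struct meta_backend {
    void *state;
    const char *(*get)(void *state, const char *path, const char *key);
    int (*set)(void *state, const char *path, const char *key,
               const char *value);
};

/* Trivial in-memory backend standing in for a dotfile or flatfile store.
 * It keeps the caller's pointers, so the strings must outlive the store. */
#define MEM_SLOTS 16
struct mem_entry { const char *path, *key, *value; };
struct mem_store { struct mem_entry e[MEM_SLOTS]; size_t used; };

static struct mem_entry *mem_find(struct mem_store *s, const char *path,
                                  const char *key)
{
    size_t i;

    for (i = 0; i < s->used; i++)
        if (strcmp(s->e[i].path, path) == 0 && strcmp(s->e[i].key, key) == 0)
            return &s->e[i];
    return NULL;
}

static const char *mem_get(void *state, const char *path, const char *key)
{
    struct mem_entry *hit = mem_find(state, path, key);
    return hit ? hit->value : NULL;
}

static int mem_set(void *state, const char *path, const char *key,
                   const char *value)
{
    struct mem_store *s = state;
    struct mem_entry *hit = mem_find(s, path, key);

    if (hit == NULL) {
        if (s->used == MEM_SLOTS)
            return -1;                 /* store is full */
        hit = &s->e[s->used++];
        hit->path = path;
        hit->key = key;
    }
    hit->value = value;
    return 0;
}

/* Wrap a store (with used == 0 for an empty one) in the generic interface. */
struct meta_backend mem_backend(struct mem_store *s)
{
    struct meta_backend b;

    assert(s != NULL);
    b.state = s;
    b.get = mem_get;
    b.set = mem_set;
    return b;
}
```

Swapping the debugging backend for the fast one then means changing the one
call that constructs the meta_backend, not the code that uses it.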

> I also think you dodge the issues of 'whose metadata is it', and 'can I
> assign metadata to a file I can't write'.  The _big_ problem is, how do you
> export that data to other systems?  Maybe we need a 'gnome-file' type, that
> contains meta-info in a wrapper around the file itself... *shrug*

Getting the data to other systems _is_ a problem.  My initial solution was
to modify NFS to send the metadata stream to a client that requests it but
nobody seemed to like that.  The only other solution that I see offhand is
a GNOME attribute daemon to run alongside the NFS daemon (a la xfs).  (This
daemon could be anything - from a custom app to a SQL server.)  That's
what I talked about when I spoke of an NFS server with hidden data.

I hope Miguel is still listening.  =)

--
Christopher Curtis               - http://www.ee.fit.edu/users/ccurtis
                                 - System Administrator, Programmer
Melbourne, Florida  USA          - http://www.lp.org/