Meta Data Monkey Wrenches



I'd like to point out a potential problem and make a suggestion.  The post is
quite long, as I got to rambling a bit and wasted (?) my lunch hour.

I don't know what other people use/intend to use metadata for, but for me
type/kind/creator/icon is the least of it.  I like to attach notes to things.
The Mac supports *some* of this.  There is a 'comment' block that I can add
information to in the GetInfo panel, and I can colorize my icons to flag certain
things.  I don't think that this is particularly a function of the Mac
filesystem, since the information is easily lost by rebuilding the desktop file
(read: database).  This is supposedly fixed in the newest versions of the OS.

There are several nifty things that get done with this on the Mac...one of my
favorites is a Netscape thing.  If I see a JPEG that I like, rip it off of the
web page, and drop it into a folder (gotta love that), I can later go back and do
a Command-I (GetInfo) and the URL is stuffed into the comment block.  I sometimes
use this instead of bookmarks, since people neglect to name their pages and the
subject of a URL is sometimes difficult to discern.  A picture is worth a
thousand words, they say.  I've got a folder of JPEGs and GIFs whose icons
are thumbnails...I have sort of a visual memory...makes it easy.

But I digress.  Point is, there are a lot of uses for MetaData, and there will be
more that have not yet been anticipated.  Even the Macintosh provides only poor,
single-user-oriented support for MetaData.  There's the first
Monkey-wrench...Unix in almost any flavor is *not* single-user.  This guarantees
the need for a database of sorts, even if it is flat-file or directory-tree
based.

Here's the second Monkey-wrench to be thrown into the works:  GNOME purports to
be Network-Aware.  This makes the document space (already convoluted with
symlinks) no longer limited to the filesystem.  The most wonderful feature of GMC
is the VFS support...I'd like all Gnome apps to be able to take advantage of
that.  I'd also like to have my MetaData with objects in those VFS spaces.  Then,
let's not forget Web pages, etc.  As for convoluted:

/afs/rose-hulman.edu/Users/faculty/slover/Public/HTML/index.html
/home/slover/samba/Public/HTML/index.html
ftp://slover@bogus.rose-hulman.edu/Public/HTML/index.html
http://www.rose-hulman.edu/~slover/index.html

These are just 4 possible paths to my web page.  There are several more.  This
might be a poor example, since it is pretty easy to write a regexp that would
identify ~/index.html and those other derivatives as my home page.  Documents
reachable by other paths would not necessarily be so easy.
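
For what it's worth, here is roughly what such a regexp might look like, sketched
in Python.  The pattern only knows about the four paths listed above, and the
names HOME_PAGE and is_home_page are made up for the illustration.

```python
import re

# Hypothetical pattern covering only the four example paths from the post.
# re.VERBOSE lets the alternatives sit on separate lines with comments.
HOME_PAGE = re.compile(
    r"""^(?:
        /afs/rose-hulman\.edu/Users/faculty/slover/Public/HTML   # AFS
      | /home/slover/samba/Public/HTML                           # Samba mount
      | ftp://slover@bogus\.rose-hulman\.edu/Public/HTML         # FTP
      | http://www\.rose-hulman\.edu/~slover                     # HTTP
    )/index\.html$""",
    re.VERBOSE,
)


def is_home_page(path):
    """Return True if path is one of the known spellings of the home page."""
    return HOME_PAGE.match(path) is not None
```

Every new way of reaching the same document means another alternative in the
pattern, which is exactly why this does not scale past the easy cases.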

My points:

1) Anything that depends on the filesystem to support storing
   this information is broken from the start, as the document
   space is greater than the local file system.

2) Storing the meta-content local to the file within the
   filesystem is troublesome at best, and is a single-user
   approach.

3) GNOME is out in front on this one.  Everything else is still
   tied to the single-user, objects-occupy-a-single-namespace
   paradigm.  If the GNOME resolution is useful, portable,
   and simple, it stands a chance of being *the* resolution.

My thoughts:

I think that the local content should be nothing more than an identity
attribute...sort of an 'object key'.  Ideally, it should live *inside* the file
(for new filetypes, or old ones that support comments or tags) where it cannot
get disassociated.  Less ideally, it should live locally to the file (the .info
approach).  Least ideally, perhaps regexps and hashes.  I think all should be
supported, with the knowledge that only the first is shell and filesystem 'safe'.
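
A minimal sketch of that three-step preference order, in Python.  Everything
concrete here is an assumption made for the example: the "Meta-UUID:" tag
format, the "<file>.info" sidecar naming, the path_rules argument, and the
function name find_object_key.  Hashes are left out to keep it short.

```python
import os
import re

# Assumed tag format: "Meta-UUID: " followed by a 36-character UUID string.
TAG = re.compile(rb"Meta-UUID:\s*([0-9a-fA-F-]{36})")


def find_object_key(path, path_rules=()):
    """Return (key, source), trying the most reliable source first."""
    # 1. Ideal: the key lives inside the file, in a comment or tag.
    try:
        with open(path, "rb") as f:
            match = TAG.search(f.read(65536))  # only scan the first 64 KB
        if match:
            return match.group(1).decode("ascii"), "embedded"
    except OSError:
        pass  # unreadable file: fall through to the weaker methods

    # 2. Less ideal: a sidecar file next to the original (assumed naming).
    sidecar = path + ".info"
    if os.path.isfile(sidecar):
        with open(sidecar, "r", encoding="ascii") as f:
            key = f.read().strip()
        if key:
            return key, "sidecar"

    # 3. Least ideal: (regexp, key) rules matched against the path itself.
    for pattern, key in path_rules:
        if re.search(pattern, path):
            return key, "regexp"

    return None, None
```

Returning the source alongside the key lets a caller know how much to trust the
association, since only the embedded form survives a mv or cp.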

An object-key cache is probably appropriate to keep around, to
save opening a file each time the file browser wants to find
the object key to reference an icon.
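
Such a cache could be as simple as the following Python sketch.  It assumes a
caller-supplied extract(path) function that does the expensive open-and-scan,
and it treats a change in mtime or size as the signal to re-read; the class
name KeyCache is invented for the example.

```python
import os


class KeyCache:
    """Remember object keys so unchanged files are not reopened."""

    def __init__(self, extract):
        self._extract = extract   # hypothetical function: path -> key
        self._entries = {}        # path -> ((mtime_ns, size), key)

    def get(self, path):
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)
        hit = self._entries.get(path)
        if hit is not None and hit[0] == stamp:
            return hit[1]         # file looks unchanged: reuse the cached key
        key = self._extract(path)  # first visit or file changed: re-read
        self._entries[path] = (stamp, key)
        return key
```

A stat() per file is still needed, but that is what a file browser does anyway
when it lists a directory.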

My thinking is that this key should be something similar to the DCE UUID.  Code
is available on the OSF web site.

Related thoughts:

Once an object has been given a 'signature' or object-key, no new key need ever
be assigned.  There can be a per-user database and perhaps a system wide database
(and an internet-wide database?) which stores meta-information.  I also get the
gut feeling that the bulk of interesting file types are documents and scripts
which will in some form support embedding a UUID string.  I read some comment
about 'modifying the file contents scares the bejesus outa me' and I'd agree.
But including a UUID tag or comment in a .tiff or .xpm when ee saves it is
something I would consider sane to do.  If we provided a 'create_uuid'
command-line utility and corresponding applet to manufacture IDs upon request,
I'd certainly use one or the other and paste the results into a comment in my
scripts, making them meta-capable.
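
To make the 'create_uuid' idea concrete, here is a small Python sketch.  It
leans on the standard library's uuid.uuid1(), which produces a time-based UUID
in the same general format as the DCE one; the "# Meta-UUID:" comment format
and the tag_script helper are assumptions for the example, not an agreed
convention.

```python
import re
import uuid

# Assumed comment format for scripts that use '#' comments (Perl, shell, ...).
TAG = re.compile(r"^# Meta-UUID: \S+$", re.MULTILINE)


def create_uuid():
    """Manufacture a new time-based UUID string on request."""
    return str(uuid.uuid1())


def tag_script(source, key=None):
    """Return source with a Meta-UUID comment added, unless one is present."""
    if TAG.search(source):
        return source  # once an object has a key, never assign a new one
    line = "# Meta-UUID: %s\n" % (key or create_uuid())
    if source.startswith("#!"):
        # Keep the interpreter line first so the script still runs.
        first, _, rest = source.partition("\n")
        return first + "\n" + line + rest
    return line + source
```

Note that tag_script only returns the new text; whether and when to write it
back to disk is left to the user, in keeping with the worry about modifying
file contents behind someone's back.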

For some things, I *would* consider it safe to write a script to add these object
keys automagically...PDFs, PostScript files, certain scripts, for instance.
It'd sure be nice to browse through the 80 or so off-the-cuff Perl scripts in my
~/tools directory with GMC and be able to write (and later read) a short comment
describing what they do instead of grepping through or opening each one every six
months while thinking, "I know I have an ASCII to EBCDIC converter here
_somewhere_".

Regardless, on my system at least, the files I move around most are the
'interesting' ones: documents, scripts, web pages, graphics.  These, save for
some graphics formats, can all support comments in some way internally.  The
interesting ones that don't let me alter their structure too much tend to live in
static places, and regexps are probably good enough:
'/usr/local/etc/httpd/logs/access.log', '/var/log/messages'.

I'd also then consider it sane to submit a patch to the Apache guys that allowed
Apache to use the Gnome libraries to extract the UUID, if it exists, in order to
add an X-Meta-UUID: MIME header to everything it serves up...maybe this would
eventually make it into an HTTP spec.  Then it'd be nice to see that the comment
I attached to someone's URL is still associated with it later when I encounter it
at a mirror or somewhere in AFS-land, or simply when the web page moves.
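
A tiny Python sketch of why keying on the UUID gets this for free.  The key
value, the two toy databases, and the rule that per-user fields override
system-wide ones are all assumptions for the illustration.

```python
from collections import ChainMap

# Made-up key, purely for illustration.
KEY = "123e4567-e89b-12d3-a456-426614174000"


def lookup(key, user_db, system_db):
    """Merge the records for key; per-user fields win over system-wide ones."""
    return dict(ChainMap(user_db.get(key, {}), system_db.get(key, {})))


system_db = {KEY: {"comment": "Faculty home page", "icon": "page"}}
user_db = {KEY: {"comment": "Ask him about the AFS setup"}}

# Two of the paths from the post, both resolved (by whatever means) to KEY.
locations = {
    "http://www.rose-hulman.edu/~slover/index.html": KEY,
    "/afs/rose-hulman.edu/Users/faculty/slover/Public/HTML/index.html": KEY,
}

# The same note comes back no matter which path led to the document.
notes = {
    loc: lookup(key, user_db, system_db)["comment"]
    for loc, key in locations.items()
}
```

The databases never see a path at all, so a document that moves or gets
mirrored keeps its comment as long as its key travels with it.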

Anyway, just some thoughts.

--Robert

P.S.  I've been away from the Gnome list for over a week, and
      owe some e-mails to some others.  I've not fully digested
      everything that is going on....I saw some discussion
      between Elliot and Miguel regarding ORBit and network
      dynamism.  I'd suggest that perhaps (don't know) there
      are DCE-related papers on the subject with possible
      solutions?  I seem to remember reading something a while
      back about 'end point mapping' and how to handle it when
      an object (machine, person) moves between cells...
