Meta Data Monkey Wrenches



I'd like to point out a potential problem and make a suggestion.  The
post is quite long, as I got to rambling a bit and wasted (?) my lunch
hour.

I don't know what other people use/intend to use metadata for,
but for me type/kind/creator/icon is the least of it.  I like to attach
notes to things.  The Mac supports *some* of this.
There is a 'comment' block that I can add information to in the GetInfo
panel, and I can colorize my icons to flag certain things.  I don't
think that this is particularly a function of the Mac filesystem, since
the information is easily lost by rebuilding the desktop file (read:
database).  This is supposedly fixed in the newest versions of the OS.

There are several nifty things that get done with this on the Mac...one
of my favorites is a Netscape thing.  If I see a JPEG that I like, rip
it off of the web page, and drop it into a folder (gotta love that), I
can later go back and do a Command-I (GetInfo) and the URL is stuffed
into the comment block.  I sometimes use this instead of bookmarks,
since people neglect to name their pages and the subject of a URL is
sometimes difficult to discern.  A picture is worth a thousand words,
they say.  I've got a folder of JPEGs and GIFs whose icons are
thumbnails...I have sort of a visual memory...makes it easy.

But I digress.  Point is, there are a lot of uses for MetaData, and
there will be more that have not yet been anticipated.  Even the
Macintosh provides only poor, single-user-oriented support for
MetaData.  There's the first Monkey-wrench...Unix in almost any flavor
is *not* single user.  This guarantees the need for a database of
sorts, even if it is flat-file or directory-tree based.

Here's the second Monkey-wrench to be thrown into the works:  GNOME
purports to be Network-Aware.  This makes the document space (already
convoluted with symlinks) *no longer limited to the filesystem*.  The
most wonderful feature of GMC is the VFS support...I'd like all GNOME
apps to be able to take advantage of that.  I'd also like to have my
MetaData with objects in those VFS spaces.  Then, let's not forget Web
pages, etc.  As for convoluted:

/afs/rose-hulman.edu/Users/faculty/slover/Public/HTML/index.html
/home/slover/samba/Public/HTML/index.html
ftp://slover@bogus.rose-hulman.edu/Public/HTML/index.html
http://www.rose-hulman.edu/~slover/index.html

These are just 4 possible paths to my web page.  There are several
more.  This might be a poor example, since it is pretty easy to write a
regexp that would identify ~/index.html and those other derivatives as
my home page.  Documents in other paths would not necessarily be so
easy.
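To make the regexp idea concrete, here is a minimal Python sketch that
folds the four paths above onto one location-independent name.  The
"slover-web:" prefix and the function name are made up for illustration;
nothing here is an existing GNOME or GMC interface.

```python
import re

# Hypothetical patterns: each one recognizes a different route to the
# same Public/HTML tree and captures the part of the path below it.
HOME_PAGE_PATTERNS = [
    re.compile(r"^/afs/rose-hulman\.edu/Users/faculty/slover/Public/HTML/(?P<rest>.*)$"),
    re.compile(r"^/home/slover/samba/Public/HTML/(?P<rest>.*)$"),
    re.compile(r"^ftp://slover@bogus\.rose-hulman\.edu/Public/HTML/(?P<rest>.*)$"),
    re.compile(r"^http://www\.rose-hulman\.edu/~slover/(?P<rest>.*)$"),
]


def canonical_name(path):
    """Return a location-independent name, or None if nothing matches."""
    for pattern in HOME_PAGE_PATTERNS:
        match = pattern.match(path)
        if match:
            # "slover-web:" is an invented namespace label.
            return "slover-web:" + match.group("rest")
    return None
```

Every new route to the same document needs another hand-written
pattern, which is why this only scales to files that live in a few
well-known places.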

My points:

1) Anything that depends on the filesystem to support storing
   this information is broken from the start, as the document
   space is greater than the local file system.

2) Storing the meta-content local to the file within the
   filesystem is troublesome at best, and is a single-user
   approach.

3) GNOME is out in front on this one.  Everything else is still
   tied to the single-user, objects-occupy-a-single-namespace
   paradigm.  If the GNOME resolution is useful, portable,
   and simple, it stands a chance of being *the* resolution.

My thoughts:

I think that the local content should be nothing more than an identity
attribute...sort of an 'object key'.  Ideally, it should live *inside*
the file (for new filetypes, or old ones that support comments or tags)
where it cannot get disassociated.  Less ideally, it should live locally
to the file (the .info approach).  Least ideally, perhaps regexps and
hashes.  I think all should be supported, with the knowledge that only
the first is shell and filesystem 'safe'.
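A rough Python sketch of that order of preference: look inside the
file first, then beside it.  The "X-Meta-UUID:" tag text, the
"<name>.info" sidecar naming, and the 4 KB scan window are all
assumptions of mine; no format has been agreed on.

```python
import re
from pathlib import Path

# Assumed tag format: "X-Meta-UUID: " followed by a UUID in the usual
# 8-4-4-4-12 hex layout.
KEY_TAG = re.compile(
    rb"X-Meta-UUID:\s*([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)


def _first_key(data):
    """Return the first tagged key found in a bytes buffer, or None."""
    match = KEY_TAG.search(data)
    return match.group(1).decode("ascii").lower() if match else None


def embedded_key(path, scan_bytes=4096):
    """Preferred: a key carried inside the file itself."""
    with open(path, "rb") as handle:
        return _first_key(handle.read(scan_bytes))


def sidecar_key(path):
    """Fallback: a key stored in a hypothetical '<name>.info' file."""
    info = Path(str(path) + ".info")
    return _first_key(info.read_bytes()) if info.is_file() else None


def find_object_key(path):
    """Try the 'safe' location first, then the file's neighbor."""
    return embedded_key(path) or sidecar_key(path)
```

The regexp-and-hash fallback would slot in as a third step after
sidecar_key(); it is left out here because it needs a pattern table
rather than anything stored with the file.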

An object-key cache is probably appropriate to keep around, to
save opening a file each time the file browser wants to find
the object key to reference an icon.
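One way such a cache might look, as a Python sketch.  The class name
is invented, and treating a changed modification time or size as "the
file may have a different key now" is my assumption about when a
cached entry should be thrown away.

```python
import os


class ObjectKeyCache:
    """Remember object keys so a browser need not reopen every file."""

    def __init__(self, extract_key):
        # extract_key is whatever function actually opens the file
        # and digs out its key; the cache does not care how.
        self._extract_key = extract_key
        self._entries = {}  # path -> ((mtime_ns, size), key)

    def lookup(self, path):
        stat = os.stat(path)
        stamp = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # file unchanged: no need to open it
        key = self._extract_key(path)
        self._entries[path] = (stamp, key)
        return key
```

A stat() call is still needed per lookup, but that is far cheaper
than opening and scanning each file every time a directory is drawn.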

My thinking is that this key should be something similar to the DCE
UUID.  Code is freely available on the OSF web site.

Related thoughts:

Once an object has been given a 'signature' or object-key, no new key
need ever be assigned.  There can be a per-user database and perhaps a
system wide database (and an internet-wide database?) which stores
meta-information.  I also get the gut feeling that the bulk of
interesting file types are documents and scripts which will in some form
support embedding a UUID string.  I read some comment about 'modifying
the file contents scares the bejesus outa me' and I'd agree.  But
including a UUID tag or comment in a .tiff or .xpm when ee saves it is
something I would consider sane to do.  If we provided a 'create_uuid'
command-line utility and corresponding applet to manufacture IDs upon
request, I'd certainly use one or the other and paste the results into a
comment in my scripts, making them meta-capable.
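A 'create_uuid' utility could be very small.  The Python sketch below
uses the standard uuid module and its random (version 4) generator
rather than the DCE code itself, which is a substitution on my part;
the "X-Meta-UUID:" comment text is likewise only a placeholder.

```python
import uuid


def create_uuid():
    """Manufacture a fresh identifier as a string."""
    # uuid4() is random-based; the DCE generator is time/node based.
    return str(uuid.uuid4())


def as_comment(key, prefix="#"):
    """Format a key as a one-line comment ready to paste into a script."""
    return f"{prefix} X-Meta-UUID: {key}"


if __name__ == "__main__":
    print(as_comment(create_uuid()))
```

Run as a script, it prints one comment line per invocation; passing
prefix="%" or prefix="//" to as_comment() would cover languages with
other comment styles.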

For some things, I *would* consider it safe to write a script to add
these object keys automagically...PDFs, PostScript files, certain
scripts, for instance.
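For the script case, the tagging step might look like the Python
sketch below.  It works on text only and leaves reading and writing
the file to the caller; the tag wording and function name are
assumptions, and it deliberately does not attempt PDF, where inserting
bytes needs more care.

```python
import uuid

TAG = "X-Meta-UUID:"  # assumed tag text, not an agreed format


def tag_script_text(text, comment_prefix="#"):
    """Return text with a key comment added, unless one is present."""
    if TAG in text:
        return text  # a key, once assigned, is never replaced
    line = f"{comment_prefix} {TAG} {uuid.uuid4()}\n"
    if text.startswith("#!"):
        # Keep the interpreter line first so the script still runs.
        first, _, rest = text.partition("\n")
        return first + "\n" + line + rest
    return line + text
```

Running it twice over the same text changes nothing the second time,
which matters if the script is ever pointed at a whole directory.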

It'd sure be nice to browse through the 80 or so off-the-cuff Perl
scripts in my ~/tools directory with GMC and be able to write (and later
read) a short comment describing what they do instead of grepping
through or opening each one every six months while thinking, "I know I
have an ASCII to EBCDIC converter here _somewhere_".
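With keys in place, the comment store itself can be a plain mapping
from key to note.  The Python sketch below layers a per-user store
over a system-wide one using collections.ChainMap; the two UUIDs and
the notes are invented sample data, and a real store would of course
live on disk.

```python
from collections import ChainMap

# Made-up object keys for illustration.
SCRIPT_KEY = "0f8fad5b-d9cb-469f-a165-70867728950e"
LOG_KEY = "7c9e6679-7425-40de-944b-e07fc1f90ae7"

# System-wide notes, shared by everyone on the machine.
system_notes = {
    LOG_KEY: "Apache access log",
    SCRIPT_KEY: "Perl script",
}

# Per-user notes start empty and take precedence on lookup.
user_notes = {}

notes = ChainMap(user_notes, system_notes)

# Writes through a ChainMap land in the first mapping only, so my
# comment never disturbs the shared store.
notes[SCRIPT_KEY] = "ASCII to EBCDIC converter"
```

Looking up notes[SCRIPT_KEY] now gives my own comment, while
notes[LOG_KEY] still falls through to the system-wide entry.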

Regardless, on my system at least, the files I move around most
are the 'interesting' ones: documents, scripts, web pages, graphics.
These, save for some graphics formats, can all support comments in some
way internally.  The interesting ones that don't let me alter their
structure too much tend to live in static places and regexps are
probably good enough:
'/usr/local/etc/httpd/logs/access.log', '/var/log/messages'.

I'd also then consider it sane to submit a patch to the Apache guys that
allowed Apache to use the GNOME libraries to extract the UUID if it
exists in order to add an X-Meta-UUID: MIME-style header to everything
it serves up...maybe this would eventually make it into an HTTP spec.
Then it'd be nice to see that the comment I attached to someone's URL is
still associated with it later when I encounter it at a mirror or
somewhere in AFS-land, or simply when the web page moves.
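On the receiving end, the payoff would be a lookup keyed on that
header instead of on the URL.  A minimal Python sketch, assuming the
response headers arrive as a plain dict and that the note store is a
mapping from lowercase UUID strings to comments (both assumptions):

```python
def note_for_response(headers, notes):
    """Find my note for a served object by its X-Meta-UUID header."""
    for name, value in headers.items():
        # Header names are case-insensitive in HTTP.
        if name.lower() == "x-meta-uuid":
            return notes.get(value.strip().lower())
    return None  # server sent no key, so there is nothing to match
```

Two responses from different hosts that carry the same key would then
turn up the same note, whatever their URLs happen to be.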

Anyway, just some thoughts.

--Robert

P.S.  I've been away from the GNOME list for over a week, and
      owe some e-mails to some others.  I've not fully digested
      everything that is going on....I saw some discussion
      between Elliot and Miguel regarding ORBit and network
      dynamism.  I'd suggest that perhaps (don't know) there
      are DCE-related papers on the subject with possible
      solutions?  I seem to remember reading something a while
      back about 'end point mapping' and how to handle it when
      an object (machine, person) moves between cells...


