Re: [Summary] Meta-data/filesystem-encapsulation



On Fri, 14 Aug 1998, David Jeske wrote:

> ** Solution Proposed by David Jeske (jeske@chat.net):
> 
> I think we should start with a system much like NeXT used, because
> they had a similar goal. That is, they wanted to put an
> encapsulation/resource system on top of UNIX with minimum disruption
> and maximum gain. Another support for this is that GnuStep already
> does it (although this isn't enough by itself).
> 
> If you don't understand the explanation below, please email questions
> to me directly, I'll explain them to you and include the description
> in a future summary.
> 
> 1) use extensions for type information (of both files and directories)
Cool. Get a Win95 box, click on the filename below the icon, delete the
last character, and watch how the icon changes, etc.
Very cool, and very intuitive. Even better with MS-DOS disks (which are
used to interchange data even today) and their 8.3 name space.
I can see it now:
SomeUnixBox:
# cp somepic.tiff /mnt/floppy
LinuxBox:
# cp /mnt/floppy/* .
Oops, it's somepic.tif, and if you are unlucky, it could be somepic.ti1,
...
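
To make the failure mode concrete, here is a minimal Python sketch of a
naive 8.3 truncation. The helper name to_8_3 is made up for illustration,
and real FAT drivers differ in detail (case folding, illegal characters,
collision suffixes such as the .ti1 above), so treat it as an
approximation only.

```python
def to_8_3(name):
    """Naively squeeze a file name into the MS-DOS 8.3 scheme.

    Hypothetical helper: case folding, illegal characters and name
    collisions are deliberately ignored in this sketch.
    """
    base, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: the whole name is the base, there is no extension.
        base, ext = name, ""
    short = base[:8]
    if ext:
        short += "." + ext[:3]
    return short


print(to_8_3("somepic.tiff"))
```

The point is that the last character of the extension is silently lost,
so any type lookup keyed on ".tiff" no longer matches after the round
trip over the floppy.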

> 2) encapsulate all 'multi-file' items on the system into a directory.
But it does not solve the problem with existing file types. For example,
take a TIFF file. Normally, gimp would be the editor for this. But if
the TIFF file contains some scientific data represented as an image, it
should launch the ``ScientificImageEditorAndViewer''.

And for compatibility with existing tools like this, you cannot do
mv filename.tiff tmp
mkdir filename.tiff
mv tmp filename.tiff/main_item

This would break all ``ancient'' apps.
(This is perhaps ok for NeXT, but X11 comes with many legacy apps.)

> 3) require apps and code to use relative paths!
> 4) make a type/icon/app server which will keep track of things like
>    applications to launch for a type, extension -> MIME type mappings,
>    BUT...
> 5) have apps/wrappers passively export type information (i.e. no
>    configuration files which have to be hacked when an app is installed)
>    In other words, the 'type/icon/app server' should just be a _cache_ of
>    information which is pulled from every app-wrapper. 
The problem here is that it is application-centric, not document-centric
(or call it OO). It's not really metadata you are talking about, but
rather application resources.

> The tradeoffs of this are
>  - we can't easily do BeOS style 'app preference per file'
>  - types are defined by extensions (some don't like this)
>  + types are defined by extensions (this is how it is on the internet)
Nope. That is not how it is done. If by Internet you mean the WWW, then
I'd like to tell you something about HTTP and the Content-Type: field in
every response :)
If you mean the Internet at large, then you cannot say something like
this, as Macs, Win32, Unix boxes, etc. all do it their own way. :)
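
As a small illustration of the Content-Type point, the following Python
sketch reads the media type out of a header value with the standard
library's email.message.Message class. The helper name media_type and the
sample header values are made up for the example; no extension is
consulted anywhere.

```python
from email.message import Message


def media_type(header_value):
    """Return the bare media type from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() drops parameters such as charset and lowercases.
    return msg.get_content_type()


# Sample header values (invented for illustration).
print(media_type("image/tiff"))
print(media_type("text/html; charset=iso-8859-1"))
```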

(By the way, it's Internet for this big network. An internet is a
 distributed systems term for two or more connected networks. So connect
 two LANs over the phone line, and you have got an internet.)

>  + apps can be moved around
>  + old UNIX programs (like cp, mv, etc) can still deal with the wrappers,
>    because they are just directories
Yes and no. It's basically the same as having filename.metadata, because
either way the user has to be aware of it. (Ever tried to mv a dir across
fs boundaries? Do you know cp: /usr/bin: directory ignored?)

>  + apps can be truly atomic, because they can store any files they want in
apps and atomic? What do you mean by atomic?
>    their directory
>  + we can provide a way to allow old school UNIX to launch new apps
>    without stuffing them all in the PATH.
This doesn't have anything to do with your proposal; it is a general
conclusion. If we can store metadata for a file ``f'', then we can write
a program that reads this metadata and does the right thing to open the
file.

>  + shell scripts can be installed and run without hacking their "#!" lines.
>  + an app can be tarred up, copied to another machine, untarred, and it'll
>    'just work' as if it was installed there. 
The problem here is that you need two current directories for this to
work:

Consider:
MetaApp MetaDoc

If you want MetaApp to find its components relatively, then you have
to expand MetaDoc to an absolute name. And vice versa.
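
A rough Python sketch of what a launcher would have to do in the first
case. The function name prepare_launch and the example paths are
hypothetical; posixpath is used so that the result does not depend on the
platform running the example.

```python
import posixpath


def prepare_launch(app_dir, doc_arg, user_cwd):
    """Return (cwd_for_app, doc_path) for starting a relocatable app.

    The app runs with its own directory as cwd so that its relative
    component paths resolve. The document argument was relative to the
    user's cwd, so it has to be made absolute before the cwd changes.
    """
    doc_path = posixpath.normpath(posixpath.join(user_cwd, doc_arg))
    return app_dir, doc_path


# Example paths are invented for illustration.
print(prepare_launch("/opt/MetaApp", "MetaDoc", "/home/andreas"))
```

Doing it the other way round (keeping the user's cwd and making every
component path of the app absolute) needs the same kind of expansion,
just on the application side.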

> Examples/Information about the points above:
> 
> 3) by requiring apps to use relative paths to get to their components,
>    we allow administrators to have control over where apps are installed.
>    In addition to other issues, this allows things like:
>     - user installing apps in their own directories
>     - multiple versions of the same app
>     - pre-compiled binaries to be installed anywhere
Again ;) You are talking about Apps.
We are talking about Documents/Objects. (Remember what the O in GNOME
stands for?)

> 6/7) by having UNIX shell 'app launching' go through this new system, we
>      can solve problems which have existed forever with launch apps on old
>      UNIX. For example:
This is a general property of any metadata system. And it's actually not
easy to implement with your ``directory'' approach, as it leaves out any
existing standard file types.
>     - we can create a way to unify the different mechanisms to configure
>       app launching. Currently in old school unix, there are at least
>       three ways to decide what apps to launch:
>         - environment variables (EDITOR)
>         - configuration files (custom for every app)
>         - hard coded (i.e. scripts with #!/usr/local/bin/perl)
> 
>       We would be able to both improve/replace all of these (for GUI apps)
>       and remain backward compatible with them. For example, we could
>       map "#!<name>" to our app launcher, where the admin/user could
>       configure what app was launched. The app server would figure out
But again, the problem is that this depends upon the document instance,
and not just upon the document class.

To make a ``class-only'' system work, one would have to allow file
extensions to be renamed transparently.
So .html_o are my own html files, and .html_dl are downloaded html files,
...

>       where that app was located. We would be able to do things like
>       "set EDITOR=///EDITOR" and have "///EDITOR" trapped and redirected
>       to launch the user's preferred editor. 
> 
>       More importantly, we could do away with custom configuration files
>       for every single app to configure what it would launch for different
>       operations and filetypes.
> 
> 
> ** Solution Proposed by Leareth <leareth@geocities.com>:
> 
> [NOTE: I don't understand this proposal. In fact, I don't understand
> many of the sentences, I'm just pasting it from an email Leareth
> posted. - jeske@chat.net]
To summarize: This is basically a global database proposal with the
LD_PRELOAD twist.

I'm not too happy with this, because:
-) a filename.metadata can also be copied manually quite easily by some
   power user with standard POSIX tools :)
-) The metadata should be as near as possible to the data. Just consider
   an external SCSI hard disk that is taken from one site to another.
   (With a global db, it loses the metadata in the process.)
-) A global db is an invitation to disaster. One bad block can wreck
   the complete database, while the same bad block with separate metafiles
   breaks only one file.

Separate metadata files also have disadvantages:
* Many, potentially small, files.
* It breaks the normal CLI UNIX abstraction of files. But so do all
  proposals. (With the db approach the user needs to remember to extract
  the info from the db, with directories he needs to remember that the
  file is a directory in reality, and with .metadata he needs to remember
  the companion file.)
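
A tiny wrapper can take the "remember the companion file" burden off the
user. The following Python sketch is only an illustration: the name
move_with_metadata is made up, and it assumes that dst is a full target
file name rather than a directory. shutil.move falls back to copy and
delete when the target is on another filesystem.

```python
import os
import shutil
import tempfile


def move_with_metadata(src, dst):
    """Move src to dst and take src + '.metadata' along if it exists."""
    shutil.move(src, dst)
    meta = src + ".metadata"
    if os.path.exists(meta):
        shutil.move(meta, dst + ".metadata")


def demo():
    """Move a.tiff and its companion inside a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "a.tiff")
        for path in (src, src + ".metadata"):
            with open(path, "w") as handle:
                handle.write("x")
        move_with_metadata(src, os.path.join(tmp, "b.tiff"))
        return sorted(os.listdir(tmp))


print(demo())
```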

So my (andreas@ag.or.at)  proposal would be:
a) Metadata is stored in a file with the same filename but an additional
   .metadata extension.
b) Logically speaking, the metadata is organized as a tree with data nodes
   that have some types like (int, string, filename, etc.)
   This would allow a lowlevel editing tool to be clever how to display
   and edit the information.
c) There is a standard layout for the tree. Example:
   document/mimeType:MIMETYPE,
   document/creator:STRING,
   ...
   appspecific/windowlayout/mainwindow:GEOMETRY
d) If a file doesn't have its own metadata, it inherits the standard
   file type metadata.
e) The file type is determined first by the file utility, and only as a
   last resort by extension.
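
Points b) to d) could be sketched in Python roughly like this. Note the
assumptions: the line syntax "path/to/node:TYPE=value" is invented for
the example (the layout above only gives path and type), as are the
document/editor node and the function names parse_metadata and lookup.

```python
def parse_metadata(text):
    """Parse lines of the assumed form 'path/to/node:TYPE=value'."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, rest = line.partition(":")
        node_type, _, value = rest.partition("=")
        nodes[key] = (node_type, value)
    return nodes


def lookup(key, file_meta, type_defaults):
    """Point d): the file's own node wins, else the file type default."""
    if key in file_meta:
        return file_meta[key]
    return type_defaults.get(key)


# Defaults for the file type, and one file that overrides the editor.
defaults = parse_metadata("document/mimeType:MIMETYPE=image/tiff\n"
                          "document/editor:FILENAME=gimp")
own = parse_metadata("document/editor:FILENAME=ScientificImageEditorAndViewer")

print(lookup("document/editor", own, defaults))
print(lookup("document/mimeType", own, defaults))
```

Keeping the type next to each value is what would let a low-level editing
tool pick a sensible widget per node.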

Coming to the implementation, there are some points:
*) There must be an ASCII representation. (Debugging aid, and vi is the
   tool when disaster strikes.)
*) If ASCII proves too slow, some kind of libdb (or libgdbm, etc.) can
   be used, and a utility that translates between the ASCII and binary
   representations must be provided.

> Now here's where we get creative. We build both central & user databases,
> the central with a list of files that contain the name(s) of the users
> that need to be updated when the corresponding file is changed. Each
> user keeps a database of metadata on all files that have been
> extended, including files from remote sites. This database will be
> updated by a library that sets & gets the data, and is updated by
> libvfs when a file is changed through normal means such as mv. This
This is difficult, as mv is basically a link/unlink or a copy.
And with a copy, all you get is the fact that some program has read the
file a and written the file b. Without VERY EXPENSIVE caching it's
difficult to be sure that a==b afterwards.
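
To show where the expense comes from, here is one possible way to
recognise a copy, sketched in Python: remember a hash of the full file
contents and look it up again later. This is an assumption about how such
a cache could work, not part of the quoted proposal; the class name
MetadataCache and its methods are hypothetical.

```python
import hashlib


def content_key(data):
    """Hash the full contents; this is the expensive part for big files."""
    return hashlib.sha256(data).hexdigest()


class MetadataCache:
    """Hypothetical cache mapping content hashes to metadata."""

    def __init__(self):
        self._by_hash = {}

    def remember(self, data, metadata):
        self._by_hash[content_key(data)] = metadata

    def recognise(self, data):
        """Return the metadata of a file with identical contents, if any."""
        return self._by_hash.get(content_key(data))


cache = MetadataCache()
cache.remember(b"contents of a", {"document/creator": "andreas"})
print(cache.recognise(b"contents of a"))
```

Every written file would have to be read in full and hashed to keep such
a cache current, and a copy that changes a single byte is no longer
recognised.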

Andreas
-- 
Win95: n., A huge annoying boot virus that causes random spontaneous system
     crashes, usually just before saving a massive project.  Easily cured by
     UNIX.  See also MS-DOS, IBM-DOS, DR-DOS, Win 3.x, Win98.


