Re: Making metadata storage SQL-driven

On Sat, 03-09-2005 at 10:37 +0100, Jamie McCracken wrote:
> per-system metadata is out of scope for this. You would need a root 
> based solution for this (bad idea) which will compromise privacy of data 
> and security

This is completely untrue and unfounded.  Unless you're talking about
building a hybrid metadata-store/metadata-search-engine, which would
kind of stomp on Beagle, and probably do neither of the two things.

>  (cf. the flak Google Desktop Search got when it was 
> initially released system wide). I don't want others seeing what docs I 
> have in my home folder (be it through thumbnails or by author)!

I am under the impression that my earlier text has been misunderstood a
*hundred* percent.  I advocate the storage of metadata in extended
attributes, for a number of reasons.  I am not trying to advocate any
particular approach for metadata-caching/search-engine for metadata.

> you can't use EAs exclusively because they are only supported on Linux 

You're rehashing what I already said!  Plus, Solaris supports them too
(if, at this point, you have not started to research the subject, I'd
advise you to do so first).  It's widely known that EAs
don't have 100% support among all UNIXes.  For those cases, you simply
provide a "default" fallback backend, and that's it.  The Beagle guys
did.  You don't see anyone complaining.
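
To make the fallback idea concrete, here is a rough sketch, assuming
Python on Linux (where os.setxattr and os.getxattr exist).  The sidecar
file name ".metadata.json" and the function names set_meta/get_meta are
made up for illustration; this is not how Beagle implements its fallback.

```python
import errno
import json
import os

SIDECAR = ".metadata.json"  # hypothetical per-directory fallback store
UNSUPPORTED = {errno.ENOTSUP, errno.EOPNOTSUPP}


def _sidecar_path(path):
    return os.path.join(os.path.dirname(os.path.abspath(path)), SIDECAR)


def _load_sidecar(path):
    try:
        with open(_sidecar_path(path), encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def set_meta(path, key, value):
    """Store key=value as an EA; use the sidecar only if EAs are unsupported."""
    try:
        os.setxattr(path, "user." + key, value.encode("utf-8"))
        return "xattr"
    except AttributeError:
        pass  # this platform's Python has no os.setxattr
    except OSError as exc:
        if exc.errno not in UNSUPPORTED:
            raise  # e.g. EACCES: no write permission means no metadata
    data = _load_sidecar(path)
    data.setdefault(os.path.basename(path), {})[key] = value
    with open(_sidecar_path(path), "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    return "sidecar"


def get_meta(path, key):
    """Return the stored value, or None if the key was never set."""
    try:
        return os.getxattr(path, "user." + key).decode("utf-8")
    except AttributeError:
        pass  # no EA API on this platform, consult the sidecar instead
    except OSError as exc:
        if exc.errno == errno.ENODATA:
            return None  # EAs work here, the attribute just is not set
        if exc.errno not in UNSUPPORTED:
            raise
    return _load_sidecar(path).get(os.path.basename(path), {}).get(key)
```

Callers only ever see set_meta() and get_meta(); which backend ended up
holding the value is reported by set_meta() but otherwise invisible.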

> and on certain filesystems and you can only set them if you have write 
> permission on the file.

Evidently.  And that is exactly the semantics you're looking for.  Why
on Earth should you be setting metadata on a file you cannot write to?
This is a question with two answers, per-file and per-user metadata.
Per-file metadata should be set in EAs.  Per-user metadata, we'll see,
but we've been doing fine with the nautilus metafiles so far, right?

>  The other issue is that EA's are limited to 
> key/value pairs so you cannot *efficiently* use them for relational data 
>   such as contextual stuff.

I *never* advocated that use.  I merely spotlighted that storage
concerns should not submit to querying concerns.  If you want to perform
relational magic with EAs, read them off disk and cache/index them, just
like you read *data* and cache it with Beagle, for fast searches.
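
A rough sketch of "read them off disk and cache/index them", assuming
Python on Linux.  The function names and the in-memory index layout are
my own illustration, not Beagle's design; a real indexer would also
watch for changes instead of walking the tree each time.

```python
import os
from collections import defaultdict


def read_user_xattrs(path):
    """Return {name: value} for the user.* extended attributes of path."""
    attrs = {}
    try:
        for name in os.listxattr(path):
            if name.startswith("user."):
                raw = os.getxattr(path, name)
                attrs[name[len("user."):]] = raw.decode("utf-8", "replace")
    except (AttributeError, OSError):
        pass  # no EA support, or the file is unreadable: index what we have
    return attrs


def build_index(root, reader=read_user_xattrs):
    """Walk root once and map (key, value) -> set of paths carrying it."""
    index = defaultdict(set)
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            for key, value in reader(path).items():
                index[(key, value)].add(path)
    return index
```

After one walk, a lookup such as index[("author", "manuel")] is a plain
dictionary access and never touches the disk.  The reader parameter is
only there so that a different storage backend can be plugged in.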

>  (they are also limited to strings so no 
> blobs like thumbnails and maximum size of all EAs on a file is 64KB)

I could not tell for sure if this sentence is packed with truth.

> > 
> > * Performance is not the biggest concern here.  It's functionality and
> > coherence!  I really, really propose per-system metadata should be
> > stored in an *extended attribute*: it's a POSIX standard, and it will
> > make cooperation between projects (I'm thinking KDE and console-based
> > utilities here) much, much, much, much easier and less troublesome.  If
> > current implementations have bugs, it's time to start putting pressure
> > on implementors to fix their fucked-up things and get on with it.
> > 
> > * Even if EAs turned out to be horribly slow (which they are not, on
> > most filesystems) and inefficient, you still need a common ground for
> > many apps to get metadata.  Command-line apps and the like.  KDE.  Do
> > you honestly expect they'll link to your metadata libraries?  The KDE
> > guys haven't linked to glib ever, so I say, hell no way that's gonna
> > happen!  You *need* to store and read them from extended attributes.
> > I'm sure you'll then build a cache or something, or even start using
> > Beagle to cache and query metadata.
> > 
> Thats right, EA's are not centrally indexed AFAIK. So you would still 
> need a DB or indexer to store all the EA's values centrally if you 
> wanted to perform a search efficiently

So help the Beagle developers then!

>  (else you would have to go to 
> every single file on disk and retrieve its EA's during a search which is 
> clearly unacceptable performance wise). Note KDE4 is using the Postgres 
> RDBMS for its metadata framework (Tenor) and AFAIK they are not using 
> EAs (or at least I haven't seen that in their plans, so correct me if I'm 
> wrong here)

I've been close to the Tenor devs and let me tell you: they are *not*
building a metadata search engine (that's the Kat devs), but a
contextual linkage engine, the representation of which requires a
completely different approach.  Nevertheless they've "married"
themselves to the Kat infrastructure so as to cut RTM (release to Me :-)
time.  And so they're cooperating.  That they're using an RDBMS for this
is only the consequence of the tools David Wheeler knew by heart when he
started on his thesis, and that the Kat guys already use an RDBMS.

> > Start lifting use cases from competing operating systems (Mac comes to
> > mind), and come up with a few use cases yourselves.  Examples:
> both Mac and Beagle (and KDE4 and Vista) store them in their own 
> databases! They do this for speed and superior queryability.

I told you to research use cases, not implementation strategies.

I cannot argue this particular point, because I'm ignorant regarding
the Mac thing.  But the fact that you use EAs to store metadata does not
preclude *you* from building a query engine which *uses* that data.
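
As a hedged illustration of that split, the sketch below loads
(path, key, value) triples into a throwaway SQLite cache and queries it.
The table name "meta", its columns and the find() helper are
hypothetical; the assumption is that the triples were read from EAs
beforehand and that the EAs stay the authoritative copy.

```python
import sqlite3


def make_cache(rows):
    """Build an in-memory cache from (path, key, value) tuples."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE meta (path TEXT, key TEXT, value TEXT, "
        "PRIMARY KEY (path, key))"
    )
    db.execute("CREATE INDEX meta_kv ON meta (key, value)")
    db.executemany("INSERT OR REPLACE INTO meta VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def find(db, **criteria):
    """Return sorted paths whose metadata matches every key=value given."""
    if not criteria:
        return []
    # One SELECT per criterion; INTERSECT keeps paths matching all of them.
    query = " INTERSECT ".join(
        "SELECT path FROM meta WHERE key = ? AND value = ?" for _ in criteria
    )
    params = [item for pair in criteria.items() for item in pair]
    return sorted(row[0] for row in db.execute(query, params))
```

For example, find(db, author="manuel", type="doc") answers a two-field
query from the cache alone.  If the cache is lost or stale, it can be
rebuilt from the files, which is the whole point of keeping storage and
querying separate.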

I'm getting really tired of having to rehash every single little detail
and explain advantages of one implementation vs. another.  I feel like
you're not responding to my e-mail but rather randomly quoting your own
ideas - this is probably because we operate under different assumptions.
I think I've already explained why, in my opinion, metadata should be
stored in EAs.

>  Mac and 
> Windows can use EAs and index them too as they control their own 
> platform, but we don't, and that's why neither KDE, GNOME nor Beagle (which 
> only uses them to tell if they have indexed a file) can rely on them.
> (do we really need to continue on this thread? If Alex is happy to 
> accept a D-Bus interface for a metadata server in Nautilus then we don't 
> need to worry whether it's text based, EA based or DB based.

I already told you why I think it's important to "worry" (to use your
word) about where the stuff is stored.  It's not like "abstracting" the
stuff will make the ugly problems go away.

>  Nautilus 
> already has such a search interface which beagle can use so I intend to 
> make use of that interface too)
> -- 
> Mr Jamie McCracken
Manuel Amador                   <rudd-o amautacorp com>            +593 (4) 220-7010
