Metadata Storage Daemon

Alright, so there was a quick chat on IRC last night where a few
of us realized that we want a simple metadata storage implementation
to centralize what's currently being done, all crazy-like, by several
different daemons coming at the problem from completely different
directions. My rough proposal has 2 parts:
1) A simple metadata storage service over dbus would be quite easy to
write. Obviously better APIs cost us more time and energy, but the
backbone of such a system is extremely rudimentary, so I propose that
we just go ahead and write one. No desktop search or filters etc., just
a few calls exposed over dbus to store, query and delete triples (a
combo of some unique id, the data, and the datatype/metadata). At its
core this is a sqlite db with a little extra work.
2) We take what we learn from the simple implementation and build it
into a Xesam spec for metadata storage, as well as building an
'official' Gnome ontology.
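
To make part 1 a bit more concrete, here is a minimal sketch of the storage backbone using Python's stdlib sqlite3 module. It is not the code from the bzr branch (that used SQLObject), and it leaves out the dbus layer entirely. The class name `TripleStore`, the method names and the column names (`uri`, `attribute`, `value`) are all assumptions made for illustration.

```python
import sqlite3


class TripleStore:
    """Hypothetical in-process stand-in for the daemon's storage backend."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        # One row per triple; the UNIQUE constraint makes repeated stores
        # of the same triple a no-op instead of creating duplicates.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS triples ("
            " uri TEXT NOT NULL,"
            " attribute TEXT NOT NULL,"
            " value TEXT NOT NULL,"
            " UNIQUE (uri, attribute, value))"
        )

    def store(self, uri, attribute, value):
        """Record one (uri, attribute, value) triple."""
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT OR IGNORE INTO triples (uri, attribute, value)"
                " VALUES (?, ?, ?)",
                (uri, attribute, value),
            )

    def query(self, uri, attribute=None):
        """Return (attribute, value) pairs for a uri, optionally filtered."""
        sql = "SELECT attribute, value FROM triples WHERE uri = ?"
        params = [uri]
        if attribute is not None:
            sql += " AND attribute = ?"
            params.append(attribute)
        sql += " ORDER BY attribute, value"
        return self.db.execute(sql, params).fetchall()

    def delete(self, uri, attribute=None):
        """Delete matching triples and return how many rows were removed."""
        sql = "DELETE FROM triples WHERE uri = ?"
        params = [uri]
        if attribute is not None:
            sql += " AND attribute = ?"
            params.append(attribute)
        with self.db:
            return self.db.execute(sql, params).rowcount


store = TripleStore()
store.store("file:///home/kevin/music/song.mp3", "dc:title", "Cool Song")
store.store("file:///home/kevin/music/song.mp3", "gnome:tag", "Star")
tags = store.query("file:///home/kevin/music/song.mp3", "gnome:tag")
```

Keeping every value as text and allowing several values per attribute is a design guess, not a decision; a real service would presumably expose these three calls as dbus methods.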

While the strength of the current Xesam Query spec is a great
indicator of how planning can design a wonderful system, I think
metadata is slightly different. Any true store (that reaches the
universal acceptance needed for ubiquity) needs to be generic, _any_
metadata about _any_ source, with social rules governing where and how
data is labeled. Since I had about an hour to kill this evening, I
sloshed together some python to outline what I am getting at. The
hodgepodge system I see as most prudent would handle an MP3 as follows:

file:///home/kevin/music/song.mp3 | dc:title | Cool Song
file:///home/kevin/music/song.mp3 | dc:author | Great Band
file:///home/kevin/music/song.mp3 | music:rating | 4
file:///home/kevin/music/song.mp3 | gnome:tag | Star

Or some files like:
file:///home/kevin/Documents/hippo.odt | gnome:project | ZooAnimals
file:///home/kevin/Documents/hippo.odt | gnome:project | FatAnimals
file:///home/kevin/Documents/giaraffe.odt | gnome:project | ZooAnimals
file:///home/kevin/Documents/zebrah.odt | gnome:project | ZooAnimals

We throw in basic timestamping of all actions and I think we have 90%
of the desktop's metadata storage needs covered. The best part is that
the footprint would be minuscule, and the code relatively stable.
While a query system that supports wildcards etc. would probably be way
better, I mostly just wanted to show the idea. I used SQLObject since it
makes life painless and I wanted to finish both this e-mail and the
sample code in under an hour. Combined with proper namespacing of
applications etc., this is all we really need at the core (maybe a few
more columns or indexes). Anyway, please share API thoughts so we
can at least pick a general direction. I would be really interested to
know a little more about the more elaborate potential use cases.
Honestly, I see 80% of use being:
1) Add lots of attributes for a URI
2) Query for all attributes associated with a URI, or query for a
specific attribute associated with a URI
3) Query for URI sets that have a certain value in a certain
attribute. * (This starts to venture into the realm of our indexers.
Obviously this is a regular use case, and we would need it plenty; I'm
just noting that any spec we try to make from this should probably
_count_ on the other desktop searches indexing their metadata, so we
really just filter on them.)
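
The three use cases above map onto very plain SQL. The sketch below runs them against the example triples from earlier in this mail, again with the stdlib sqlite3 module. The table layout, column names and the index are assumptions for illustration, not an agreed schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (uri TEXT, attribute TEXT, value TEXT)")
# Assumed index so that use case 3 does not have to scan the whole table.
db.execute("CREATE INDEX idx_attr_value ON triples (attribute, value)")

song = "file:///home/kevin/music/song.mp3"
docs = "file:///home/kevin/Documents/"

# 1) Add lots of attributes for a URI (here, for several URIs at once).
db.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        (song, "dc:title", "Cool Song"),
        (song, "dc:author", "Great Band"),
        (song, "music:rating", "4"),
        (song, "gnome:tag", "Star"),
        (docs + "hippo.odt", "gnome:project", "ZooAnimals"),
        (docs + "hippo.odt", "gnome:project", "FatAnimals"),
        (docs + "giaraffe.odt", "gnome:project", "ZooAnimals"),
        (docs + "zebrah.odt", "gnome:project", "ZooAnimals"),
    ],
)

# 2a) All attributes associated with a URI.
all_attrs = db.execute(
    "SELECT attribute, value FROM triples WHERE uri = ? ORDER BY attribute",
    (song,),
).fetchall()

# 2b) One specific attribute associated with a URI.
rating = db.execute(
    "SELECT value FROM triples WHERE uri = ? AND attribute = ?",
    (song, "music:rating"),
).fetchall()

# 3) All URIs that have a certain value in a certain attribute.
zoo = [
    row[0]
    for row in db.execute(
        "SELECT uri FROM triples WHERE attribute = ? AND value = ?"
        " ORDER BY uri",
        ("gnome:project", "ZooAnimals"),
    )
]
```

Note that hippo.odt carries two gnome:project values, so an attribute is treated as multi-valued here; whether the real store should allow that is one of the API questions still open.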

Anyways, the blob of silly test code is in a bzr branch at
so feel free to bzr branch away.

I know this isn't at all near a full implementation or spec, but I did
want to get the ball rolling on it, as it seems like a lot of people
agree that an ultra-discreet (and part of Gnome proper) system for
storing and querying metadata is in the near future.

Kevin Kubasik
