Re: [Tracker] Storing Metadata with files



Martyn Russell <martyn-bhGbAngMcJvQT0dZR+AlfA public gmane org> writes:
There seems to be a lot of XMPs though. Are you talking about
http://en.wikipedia.org/wiki/Extensible_Metadata_Platform? Does the XMP
file have to have a special file name, or is the name of the described
file part of the XMP data itself? Googling for "xmp tracker" or "xmp
metatracker" was just as fruitless as searching for "xmp" on the tracker
wiki, so I'm a bit at a loss here...

Yes we are talking about that, we even have some light support in
libtracker-extract. See:

http://library.gnome.org/devel/libtracker-extract/unstable/libtracker-extract-XMP.html


Hmm. To me that looks like I'm actually quite restricted in what
metadata I can put into tracker using XMP sidecar files. Am I right
that I will *not* be able to e.g. insert user defined tags using
sidecars?

Those functions are convenience functions for many extractors (to
avoid code duplication for one). There is nothing to say that it can't be
extended, but generally, non-standard data is not represented there
for the moment.

You're the first person to ask for this AFAIK, so there has been no
requirement to do anything more than we have now.


I guess that means that I'll have to start coding in C again *sigh*.
What are your thoughts about accepting a patch for such functionality
into the official code base?

What I would like to do is to make it possible to have a tighter
coupling between the indexed files and additional metadata about them.

As I wrote a little while ago, I'm trying to adopt tracker for managing
a document archival system. That means that files are exclusively added
to the indexed paths by a dedicated program that asks the user for
additional metadata and then somehow makes sure that this metadata ends
up in the tracker database as well.

My first attempt was to just put the document into the indexed directory
and then directly add the metadata into tracker using e.g. the dbus API.
I don't like this solution very much for two reasons. First of all, I
have to poll tracker to determine when the new document has been indexed
and I can add the additional metadata. Secondly, I don't feel
comfortable with metadata and document being separated that much. So far
I have the impression that tracker considers metadata to be mostly
transient in the sense that it can always be recovered from the file
itself, and in my case this would no longer be true. I don't have a
particular scenario in mind, but I feel that it's basically asking for
trouble if the simple act of renaming a file (or the indexed directory,
or temporarily changing the tracker index settings) would permanently
destroy all the associated metadata (even though no one is supposed to
do anything like that).
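
To make the polling problem concrete, here is a minimal sketch of the wait loop that the first approach needs. It assumes a hypothetical `is_indexed` callable standing in for whatever query (for example over D-Bus) reports that a file has been indexed; neither that callable nor `wait_until_indexed` is part of any tracker API.

```python
import time
from typing import Callable


def wait_until_indexed(
    is_indexed: Callable[[str], bool],
    uri: str,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll until is_indexed(uri) is true or the timeout expires.

    is_indexed is a hypothetical stand-in for a real query against the
    index; it is an assumption made for this sketch, not a tracker call.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_indexed(uri):
            return True
        if time.monotonic() >= deadline:
            # Give up: the caller must decide what to do with the
            # metadata that could not be attached yet.
            return False
        time.sleep(interval)
```

Only after this returns `True` can the additional metadata be written, which is exactly the window in which document and metadata are out of sync.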


Therefore I like the idea of storing the metadata in a separate file.
While this is not foolproof either, I still think that it significantly
decreases the chances of actually losing the metadata (although it may
become disassociated from the document). It also happens that this
is roughly the way the system works currently, except that swish-e is used
for indexing the metadata files and that they actually contain the entire
plain text of the document as well.


So far it seems to me that the best approach to get this to work with
tracker would be to either extend the XMP sidecars extractor to extract
more information, or to add an entirely new extractor that reads a
tracker-specific separate metadata file. But maybe there is also an
entirely different way to achieve what I want?
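
As a rough illustration of the sidecar idea, the following sketch reads user-defined tags from an XMP sidecar using only the Python standard library. It relies on the XMP convention of storing keywords in `dc:subject` as an `rdf:Bag`. The naming rule in `sidecar_path` (append `.xmp` to the full file name) and both function names are assumptions made for this example, not something tracker defines.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List

# Namespaces used by XMP for Dublin Core keywords.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sidecar_path(document: Path) -> Path:
    # Assumed convention: "report.pdf" is described by "report.pdf.xmp".
    return document.with_name(document.name + ".xmp")


def read_tags(xmp_text: str) -> List[str]:
    """Return the keywords listed under dc:subject, in document order."""
    root = ET.fromstring(xmp_text)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
    return [li.text.strip() for li in items if li.text and li.text.strip()]


EXAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <rdf:Description rdf:about="">
      <dc:subject>
        <rdf:Bag>
          <rdf:li>invoice</rdf:li>
          <rdf:li>tax</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

if __name__ == "__main__":
    print(sidecar_path(Path("/archive/report.pdf")))
    print(read_tags(EXAMPLE))
```

An extractor along these lines would still need to map the tags onto whatever tracker stores for user-defined tags; that mapping is deliberately left out here.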


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C


