Re: [Tracker] Tags, metadata and date fields



On 04/03/2010 09:56 AM, Adrien Bustany wrote:
I am currently using a set of self-developed scripts to manage a
document archival system. When a document is added to the archive,
the user has to assign a name, tags from a predefined list, a date and
optionally a couple of keywords. The script then extracts the plain
text from the document, adds the additional metadata and writes
out an XML file. The XML file is then indexed by Swish++.
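
For illustration, the "write out an XML file" step described above could look roughly like the following Python sketch. The element names (document, name, date, tags, keywords, text) and the function name build_record are hypothetical; the actual scripts and the layout they hand to Swish++ are not shown in this thread.

```python
import xml.etree.ElementTree as ET


def build_record(name, tags, date, keywords, text):
    """Return an XML string holding the extracted text plus user metadata.

    All element names here are made up for illustration; they are not
    taken from the original archival scripts.
    """
    doc = ET.Element("document")
    ET.SubElement(doc, "name").text = name
    ET.SubElement(doc, "date").text = date

    # One <tag> child per tag chosen from the predefined list.
    tags_el = ET.SubElement(doc, "tags")
    for tag in tags:
        ET.SubElement(tags_el, "tag").text = tag

    # Keywords are optional, so this container may end up empty.
    kw_el = ET.SubElement(doc, "keywords")
    for kw in keywords:
        ET.SubElement(kw_el, "keyword").text = kw

    # ElementTree escapes characters such as < and & in the plain text.
    ET.SubElement(doc, "text").text = text
    return ET.tostring(doc, encoding="unicode")
```

The returned string would then be written to a file inside the directory that the indexer scans.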

For retrieval, another script asks for keywords, tags and a date-range,
invokes Swish++ to do the search and retrieve the metadata, presents a
nice list with results and finally opens the original document.

I was wondering to what extent it would be possible to replace this
system with something more established like tracker. From the
documentation, I gather that it is possible to assign tags to the
documents. But can I also add additional metadata, like title, keywords
and timestamps and show them in the result list? Is it possible to
constrain the results to a given time span?

Yes, Tracker provides ontologies (= metadata standards) for tags, dates,
keywords and so on. Tracker uses the Nepomuk ontologies
(www.semanticdesktop.org/ontologies/) to describe its data.
Queries are then written in a language called SPARQL. SPARQL is a very
powerful language, and of course allows you to specify date constraints,
as well as any other type of constraint. I invite you to look at a SPARQL
introduction, or at the Tracker documentation for that
(http://live.gnome.org/Tracker/Documentation).
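
As a rough sketch of what such a date-constrained query might look like, the Python helper below assembles a SPARQL SELECT string. It assumes that the documents are typed as nfo:Document, that they carry nie:title, nie:contentCreated and nao:hasTag properties from the Nepomuk ontologies, and that the nfo, nie, nao and xsd prefixes are already known to the store. The helper name build_query is made up for this example, and which properties are actually populated depends on the Tracker version and its extractors.

```python
from datetime import date


def build_query(tag, start, end):
    """Build a SPARQL SELECT for documents carrying `tag` whose
    creation date lies in the half-open range [start, end).

    The property names are assumptions based on the Nepomuk
    ontologies; check them against the installed ontology.
    """
    # Refuse characters that would break out of the string literal.
    if '"' in tag or "\\" in tag:
        raise ValueError("tag must not contain quotes or backslashes")

    return f"""
SELECT ?doc ?title ?created WHERE {{
  ?doc a nfo:Document ;
       nie:title ?title ;
       nie:contentCreated ?created ;
       nao:hasTag ?t .
  ?t nao:prefLabel "{tag}" .
  FILTER (?created >= "{start.isoformat()}T00:00:00Z"^^xsd:dateTime &&
          ?created <  "{end.isoformat()}T00:00:00Z"^^xsd:dateTime)
}}
ORDER BY ?created
""".strip()


if __name__ == "__main__":
    # Documents tagged "invoice" from the first quarter of 2010.
    print(build_query("invoice", date(2010, 1, 1), date(2010, 4, 1)))
```

The title and creation date come back as result columns, which covers showing them in the result list; the FILTER clause is what restricts the time span.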

The querying with SPARQL looks quite promising indeed, thanks for the
quick answer!

However, I wasn't really able to find any information on how to get the
metadata into my index in the first place. Could you give me some
information on where to start? Do I move the documents into a
tracker-watched folder and manually modify the database after the
documents have been indexed? Or do I explicitly ask tracker to add a
specific file with specific metadata to the index? Or do I add the
metadata as extended attributes of the file and put it in a watched
folder? Or something else entirely?


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C


