Re: [Tracker] Context extraction API/program

From: Ivan Frade <ivan frade nokia com>
To: ext Andrew Leung <aleung soe ucsc edu>
Cc: tracker-list gnome org
Subject: Re: [Tracker] Context extraction API/program
Date: Tue, 15 Jul 2008 11:23:19 +0300

Hi Andrew,

On Mon, 14-07-2008 at 15:37 -0700, ext Andrew Leung wrote:

I would like to utilize Tracker's file content extraction mechanism  
within my own program. Basically I would like to be able to parse  
various file types and pull out keywords. Does Tracker have any  
mechanism (API/separate program) that I can use to pull content  
from various file types?


 The content is extracted by the scripts
in /usr/local/lib/tracker/filters/. These scripts are organized
by MIME type name, and they usually call external programs to
extract the contents (like wv, pdftotext, ...)

 Tracker obtains the MIME type of the file, determines its category and,
if that category "Has full text", calls one of those scripts.
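
 As a rough illustration of that lookup, here is a minimal Python sketch
that maps a file to a filter script by its MIME type and runs it. It is
not Tracker's own code. The directory layout
(<filters dir>/<type>/<subtype>_filter), the calling convention (input
file as the only argument, text on stdout) and the helper names
find_filter and extract_text are all assumptions made for the example;
check the scripts installed on your system before relying on any of them.
Python's mimetypes module also guesses from the file extension only, which
may differ from how Tracker detects the type.

```python
import mimetypes
import subprocess
from pathlib import Path

# Path taken from the message above; it may differ on your installation.
FILTERS_DIR = Path("/usr/local/lib/tracker/filters")


def find_filter(path, filters_dir=FILTERS_DIR):
    """Return the filter script matching the MIME type of `path`, or None.

    ASSUMPTION: scripts are laid out as <filters_dir>/<type>/<subtype>_filter.
    """
    mime, _encoding = mimetypes.guess_type(str(path))
    if mime is None:
        return None
    major, _, minor = mime.partition("/")
    candidate = Path(filters_dir) / major / f"{minor}_filter"
    return candidate if candidate.is_file() else None


def extract_text(path, filters_dir=FILTERS_DIR):
    """Run the matching filter script and return its output, or None."""
    script = find_filter(path, filters_dir)
    if script is None:
        return None
    # ASSUMPTION: the script takes the input file as its only argument
    # and writes the extracted text to standard output.
    result = subprocess.run(
        [str(script), str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

 Returning None when no script matches keeps the "no full text for this
type" case separate from a filter that ran and failed, which raises
subprocess.CalledProcessError instead.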

Beagle search has a program called 'beagle-extract-content' that I  
have been using for this purpose though I haven't been particularly  
happy with it. Thanks a lot.


 We have a "tracker-extractor" program. It extracts the _metadata_ of
the file (not the contents). Maybe it is also useful to you.

 Any improvements to the filters/extractors are welcome ;)

 Regards,

Ivan
