Re: [Tracker] Context extraction API/program
- From: Ivan Frade <ivan frade nokia com>
- To: ext Andrew Leung <aleung soe ucsc edu>
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] Context extraction API/program
- Date: Tue, 15 Jul 2008 11:23:19 +0300
Hi Andrew,
On Mon, 14-07-2008 at 15:37 -0700, ext Andrew Leung wrote:
I would like to utilize Tracker's file content extraction mechanism
within my own program. Basically I would like to be able to parse
various file types and pull out keywords. Does Tracker have any
mechanism (API/separate program) that I can use to pull content
from various file types?
The content is extracted by the scripts
in /usr/local/lib/tracker/filters/. These scripts are organized
by MIME type name, and they usually call external programs
(such as wv or pdftotext) to extract the contents.
Tracker obtains the MIME type of the file, decides its category and, if
the category "Has full text", calls one of those scripts.
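If you want to reuse the filters from your own program, something like the
rough sketch below might work. It guesses the MIME type, looks for a matching
script under the filters directory and runs it. Note that the
"<subtype>_filter" file name and the "filter <input> <output>" calling
convention are assumptions on my part, and filter_path_for / extract_text are
just illustrative names, so please check the scripts installed on your system
before relying on it.

```python
import mimetypes
import subprocess
import tempfile
from pathlib import Path

# Directory named in this thread; adjust to your installation prefix.
FILTERS_DIR = Path("/usr/local/lib/tracker/filters")


def filter_path_for(mime_type, filters_dir=FILTERS_DIR):
    """Map e.g. 'application/pdf' to <filters_dir>/application/pdf_filter.

    ASSUMPTION: the '<subtype>_filter' file name is a guess; look in the
    filters directory to see how the scripts are really named.
    """
    major, _, minor = mime_type.partition("/")
    return Path(filters_dir) / major / f"{minor}_filter"


def extract_text(path, filters_dir=FILTERS_DIR):
    """Return the text written by the matching filter, or None if no filter applies."""
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None:
        return None
    script = filter_path_for(mime_type, filters_dir)
    if not script.is_file():
        return None
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "contents.txt"
        # ASSUMPTION: the filter is invoked as: filter <input-file> <output-file>
        subprocess.run([str(script), str(path), str(out)], check=True)
        return out.read_text(errors="replace") if out.exists() else ""
```

Calling extract_text("report.pdf") would then return the extracted text, or
None when no filter script exists for that MIME type. This only guesses the
type from the file extension and does not reproduce Tracker's category check
("Has full text").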
Beagle search has a program called 'beagle-extract-content' that I
have been using for this purpose though I haven't been particularly
happy with it. Thanks a lot.
We have a "tracker-extractor" program. It extracts the _metadata_ of
the file (not the contents). It may also be useful to you.
Any improvement in the filters/extractors is welcome ;)
Regards,
Ivan