[Tracker] [WIP] Application support: man pages, Tomboy, & Liferea



This isn't quite worked out, but I want to throw this out to the group
and get some preliminary feedback.  Attached is a patch that allows us
to index system-wide and user-installed man pages and Tomboy notes, and
adds some basic Liferea support.  The external services all use the
out-of-process mechanism used by the text filter and embedded metadata
extractor.  However, there are more operations, and therefore more
programs, for each service.

First, the directory structure:
In tracker/src there now resides an "external-services" directory.
In this directory you will find one directory for each service.  The
service directories are named after their configuration key in
~/.Tracker/tracker.cfg.  This makes it easy to add new services
without recompiling trackerd (and hopefully encourages other developers
to provide tracker support with their apps!).  For example, you'll find
the directory tracker/src/external-services/IndexManPages and an
IndexManPages key under the Services group in tracker.cfg.  Each
service has five programs:

1) check-deps
 This program is called at the very beginning, if the user activates the
service's key.  It may check for any other programs that are required
for this service to work.  For example, I check for xsltproc and w3m
for the Liferea indexer.  If it returns non-zero, the indexer is
disabled.
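
To make the check-deps contract concrete, here is a minimal sketch in
Python.  It is not taken from the patch; the function names are made up
for illustration, and the only things borrowed from above are the two
helper programs and the "non-zero means disabled" rule.

```python
#!/usr/bin/env python3
"""Hypothetical check-deps sketch for a Liferea-style service."""
import shutil
import sys

# Helper programs the text says the Liferea indexer needs.
REQUIRED = ["xsltproc", "w3m"]


def missing_programs(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def main():
    """Report missing helpers on stderr and return the exit status."""
    missing = missing_programs(REQUIRED)
    for name in missing:
        print("missing dependency: " + name, file=sys.stderr)
    # Non-zero tells trackerd to disable the indexer.
    return 1 if missing else 0

# Installed as check-deps, the script would end with: sys.exit(main())
```

The `which` parameter only exists so the lookup can be swapped out when
trying the function by hand.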

2) watch-list
 This program returns a list of directories to be added to trackerd's
watch list.  You must list each directory; trackerd will not
automatically recurse into subdirectories.  If you need all subdirs, I
recommend find:
# find $basedir -type d
See IndexManPages/watch-list for an example.
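
If you'd rather not shell out to find, the same listing can be
sketched in Python.  This is an illustration only: the base directory
below is an assumption, and I'm assuming the list is printed one
directory per line on stdout.

```python
#!/usr/bin/env python3
"""Hypothetical watch-list sketch: print every directory to watch."""
import os


def list_directories(basedir):
    """Return basedir and every directory below it, sorted.

    Roughly what `find $basedir -type d` prints; symlinked
    directories are not followed.
    """
    found = []
    for dirpath, _dirnames, _filenames in os.walk(basedir):
        found.append(dirpath)
    return sorted(found)


def main(basedirs):
    """Print one directory per line for each base directory."""
    for basedir in basedirs:
        for path in list_directories(basedir):
            print(path)


if __name__ == "__main__":
    # Assumed man page location; a real service would pick its own.
    main(["/usr/share/man"])
```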

3) service-type
This program returns the service type of a file being watched by this service.
argv[1] == the full path to the file being watched
argv[2] == the mime type of the file
I provide the file path and mime type in case you need them, but I
imagine the result should be constant.

4) filter-text
This works much like the text filters you find in the
tracker/filters directory, except
argv[1] == the full path
argv[2] == the mime type of the file !!
argv[3] == the path to the filtered text !!
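
As a rough illustration of the three-argument form, here is a sketch
of a filter-text program for an XML-based note file.  It is not the
filter from the patch: the tag stripping via ElementTree is my own
stand-in, and the mime type is accepted but ignored.

```python
#!/usr/bin/env python3
"""Hypothetical filter-text sketch for an XML note file."""
import sys
import xml.etree.ElementTree as ET


def filter_text(source_path, output_path):
    """Write the plain text of an XML file to output_path."""
    root = ET.parse(source_path).getroot()
    # Join all text nodes, then collapse runs of whitespace.
    text = " ".join(" ".join(root.itertext()).split())
    with open(output_path, "w", encoding="utf-8") as out:
        out.write(text + "\n")


if __name__ == "__main__" and len(sys.argv) == 4:
    # argv[1] = full path, argv[2] = mime type (unused here),
    # argv[3] = where the filtered text should be written.
    filter_text(sys.argv[1], sys.argv[3])
```

Joining text nodes with a space can split a word that has markup in
the middle of it, which seemed acceptable for a sketch.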

5) extract-metadata
Again, this behaves like tracker-extract.  It takes a file and spits out
Key=Value;\n pairs, one for each piece of metadata
argv[1] == the full path
argv[2] == the mime type of the file
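
To show the Key=Value;\n output shape, here is a small sketch of an
extract-metadata program for man pages.  The key names (Man.Title,
Man.Section) and the trick of reading them from the file name are
assumptions made for the example, not what the patch does, and I'm
assuming the pairs go to stdout.

```python
#!/usr/bin/env python3
"""Hypothetical extract-metadata sketch for man pages."""
import os
import sys


def man_page_metadata(path):
    """Guess title and section from a name like 'ls.1.gz'."""
    name = os.path.basename(path)
    if name.endswith(".gz"):
        name = name[:-3]
    title, _dot, section = name.rpartition(".")
    if not title:
        # No dot in the name: treat the whole name as the title.
        title, section = section, ""
    # Key names are made up for this example.
    return [("Man.Title", title), ("Man.Section", section)]


def format_metadata(pairs):
    """Render (key, value) pairs as 'Key=Value;' lines."""
    return "".join("{}={};\n".format(k, v) for k, v in pairs)


if __name__ == "__main__" and len(sys.argv) == 3:
    # argv[1] = full path, argv[2] = mime type (unused here).
    sys.stdout.write(format_metadata(man_page_metadata(sys.argv[1])))
```

Values containing ";" or newlines are not escaped here; how that
should be handled is an open question for the format.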


So, like I said before, I'm including 3 implementations of this:

1) IndexManPages
The new service type is "Man Pages" and it adds a new "Man" metadata
class.  The class can tag a man page's title, section, date it was
written, source (app + version), and manual name (e.g., Debian Project
for Debian-specific man pages).  It also provides a full text indexer.
The only thing lacking here is the language the man page was written in.
Currently, I reject any non-English directory.  It's easy to index
them all, but it's just faster for me if trackerd ignores those.

2) IndexTomboy
This uses the Notes service type, and adds a Title field to the Note
metadata class.  There's obviously more I could grab from the Tomboy
files; I just haven't gotten around to it yet.  Full text is
supported.

3) IndexLiferea
This adds a service type called "Web Channels" and a metadata class
"RSS".  This indexer sucks and I need some help on it. :(
Currently, you only get one entry in the database for each feed.  So
all the text in the feed is associated with the entire feed, instead
of an individual item.  For example, if I were to search for "tracker"
I'd expect a link to a specific post by Jamie; instead I get a link to
Planet GNOME.  I'm not even sure what I need here, but I'd like some way
to associate a file with multiple database items.  Is this possible?


I'm pretty happy with the man pages indexer, and I may look into having
Yelp use it some time in the future.  But I'm not calling dibs, so anyone
else looking for a project to work on is more than welcome.

The Tomboy indexer also works as expected.  I believe Tomboy is
dbus-ified, so if anyone wants to update tracker-search-tool to
search Notes as well and fire up Tomboy when you click on a note,
that'd be awesome.


The included patch also updates tracker-search and libtracker, so you
can search for the "Man Pages" and "Notes" service types.

Attachment: tracker-external-services.patch
Description: Text Data


