Re: [Tracker] Writing custom extractors docs?
- From: Martyn Russell <martyn lanedo com>
- To: "Spivak, Max" <spivak lab126 com>
- Cc: "tracker-list gnome org" <tracker-list gnome org>
- Subject: Re: [Tracker] Writing custom extractors docs?
- Date: Fri, 05 Feb 2010 09:53:41 +0000
On 05/02/10 04:32, Spivak, Max wrote:
> Hi there,

Hi,

> I've been looking at tracker and I like it a lot. Great job.

Thank you!

> I'm looking to write custom extractors for the 0.7.x version. I'm
> wondering what docs exist.

OK.

> I found
> http://library.gnome.org/devel/libtracker-extract/unstable/libtracker-extract-tracker-extract.html
> and http://library.gnome.org/devel/libtracker-common/unstable/ -- is
> this valid and current? Any other docs?

The best place to start is:
http://live.gnome.org/Tracker/Documentation/
On that page, as you have already found, there are links to
libtracker-extract and libtracker-common.
http://library.gnome.org/devel/libtracker-extract/unstable/
http://library.gnome.org/devel/libtracker-common/unstable/

> How is the custom extractor registered with tracker-extract? Is it
> just that the libextract-<abc>.so is present in the
> lib/tracker-extract/tracker-0.7/extract-modules directory or is
> something else necessary?

I think having this documentation is perhaps not enough. We should add
to it to make this step clearer.
Essentially, yes. All that has to happen is your .so has to be in the
directory:
$prefix/lib/tracker-0.7/extract-modules/
Some other checks are made when the library is loaded, for example that
it exports the right functions. Specifically:
tracker_extract_get_data().
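To illustrate the kind of check I mean, here is a minimal sketch in plain C. It is not Tracker's actual loader code: the helper name module_has_entry_point() and the use of dlopen()/dlsym() are assumptions made for this example. It only tests that a .so can be loaded and exports a symbol called tracker_extract_get_data.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not part of Tracker).
 * Returns true if the shared object at `path` can be loaded and
 * exports a symbol named tracker_extract_get_data. */
static bool
module_has_entry_point (const char *path)
{
    assert (path != NULL);

    /* Try to load the module; failure means it is missing or broken. */
    void *handle = dlopen (path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL)
        return false;

    /* Look for the entry point every extractor module has to provide. */
    bool found = dlsym (handle, "tracker_extract_get_data") != NULL;

    dlclose (handle);
    return found;
}
```

Calling module_has_entry_point() on a path that does not exist, or on a library that lacks the symbol, returns false. On older glibc versions you may need to link with -ldl.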

> Is there a registry that maps a document's file extension to its
> mimetype? Say I have a <filename>.abc -- what maps it to
> libextract-abc.so. This is especially interesting if I have custom
> documents for which I invented an extension and a mime type.

This is a good question. Ultimately you have to have a mime type for
that file and that mime type is what you put in your extractor as
documented in the example for libtracker-extract.
If your mime type is not registered, you need to do some magic with
shared-mime-info to fix that. See:
$prefix/share/doc/shared-mime-info/shared-mime-info-spec.pdf
I can't remember exactly the details right now, but it isn't too
difficult from what I remember. If you need more help with this, let me
know.
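As a rough picture of the first step (file name to mime type), here is a small C sketch of glob-based matching, which is one of the mechanisms shared-mime-info defines (it also supports content sniffing). Everything in it is hypothetical: the table, the function mime_for_filename() and the type "application/x-abc" are invented for the example, and the real database lives in the shared-mime-info files, not in your code.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical glob table standing in for the shared-mime-info database. */
struct glob_rule {
    const char *pattern;   /* file name glob */
    const char *mime;      /* mime type assigned on a match */
};

static const struct glob_rule rules[] = {
    { "*.abc", "application/x-abc" },   /* invented type for the example */
    { "*.png", "image/png" },
};

/* Returns the mime type of the first glob matching `filename`,
 * or NULL if no rule matches. */
static const char *
mime_for_filename (const char *filename)
{
    assert (filename != NULL);

    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (fnmatch (rules[i].pattern, filename, 0) == 0)
            return rules[i].mime;
    }
    return NULL;
}
```

The mime type found this way is then what has to match the mime type your extractor declares; a file name with no matching rule gets NULL here, which corresponds to the "not registered" case above.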

> I've run across some posts that tracker will/may use
> LibStreamAnalysers from Strigi. Should I use LSA for my extractors or
> not?

Not right now. It doesn't push the data into tracker-store correctly
(mostly because it needs updating after some recent changes). It is also
mutually exclusive with our other extractors: we don't extract with both
our in-house/3rd party extractors AND LSA, but with one or the other.
Some work is needed here to allow them to be used together, and also to
fix the LSA extractor.
The single biggest problem, assuming everything else works for the LSA
extractor, is that our ontology and the ontology LSA uses do not exactly
match, and this causes quite a few warnings in tracker-store's logs.
--
Regards,
Martyn