Re: [Tracker] Using tracker extractors from other applications
- From: Nikolaus Rath <Nikolaus rath org>
- To: tracker-list gnome org
- Subject: Re: [Tracker] Using tracker extractors from other applications
- Date: Thu, 18 Nov 2010 20:47:33 -0500
Ivan Frade <ivan frade-Re5JQEeQqe8AvxtiuMwx3w public gmane org> writes:
Hi,
On Wed, Nov 17, 2010 at 5:16 PM, Nikolaus Rath <Nikolaus-BTH8mxji4b0 public gmane org> wrote:
Martyn Russell <martyn-bhGbAngMcJvQT0dZR+AlfA-XMD5yJDbdMReXY1tMh2IBg public gmane org> writes:
On 17/11/10 00:02, Nikolaus Rath wrote:
This gives me the following result:
...
So it seems that I still have to parse the entire string. Is there a way
to get the data in more structured form?
We have Python code that does this for our extraction test cases. Check the
class ExtractorHelper in:
http://git.gnome.org/browse/tracker/tree/tests/functional-tests/common/utils/helpers.py
Basically:
extractor = ExtractorHelper ()
results = extractor.get_metadata (filename)
results is a dictionary with the property as key and a list of values as value.
There are a few tricks in the property->key translation (because of anonymous
nodes)... but I think that with some prints you can figure out how it works.
This seems to work fine for the OpenOffice file:
In [39]: tracker_helper.ExtractorHelper().get_metadata('file:///home/nikratio/misc/Transaktionen.ods',
'').keys()
Out[39]:
[u'a',
u'nie:generator',
u'nie:plainTextContent',
u'nco:publisher',
u'nie:contentCreated']
but for the plain text file, there are no contents:
In [42]: tracker_helper.ExtractorHelper().get_metadata('file:///home/nikratio/misc/victoria.tex', '').keys()
Out[42]: [u'a']
Even though tracker itself returns the contents:
In [48]: proxy = bus.get_object('org.freedesktop.Tracker1.Extract',
'/org/freedesktop/Tracker1/Extract')
tracker = dbus.Interface(proxy, 'org.freedesktop.Tracker1.Extract')
tracker.GetMetadata('file:///home/nikratio/misc/victoria.tex', '')
Out[52]:
(dbus.String(u''),
dbus.String(u' a nfo:PlainTextDocument ;\n\t nie:plainTextContent "blablabla" .\n'))
Is this a bug in ExtractorHelper?
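
For comparison, here is a minimal sketch of how the statement string returned by GetMetadata above could be turned into a property -> list-of-values dictionary by hand. It is not the ExtractorHelper implementation; the names parse_statements and _TOKEN are made up for this illustration. It assumes the flat "predicate object ; predicate object ." shape shown in the output above, so it does not handle anonymous nodes, comma-separated object lists or typed literals, and it leaves backslash escapes inside strings undecoded.

```python
import re
from collections import defaultdict

# A token is a double-quoted string (backslash escapes allowed), a run of
# characters without whitespace, quotes or semicolons, or a lone ";".
# Hypothetical helper names; nothing here comes from the Tracker sources.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[^\s";]+|;')


def parse_statements(body):
    """Map 'pred obj ; pred obj .' to {pred: [obj, ...]}."""
    result = defaultdict(list)
    pair = []
    for token in _TOKEN.findall(body):
        if token in (';', '.'):
            # A separator ends the current predicate/object pair.
            pair = []
            continue
        pair.append(token)
        if len(pair) == 2:
            predicate, obj = pair
            if obj.startswith('"'):
                obj = obj[1:-1]  # drop the surrounding quotes
            result[predicate].append(obj)
        # Any further tokens before the next separator are ignored.
    return dict(result)


sample = ' a nfo:PlainTextDocument ;\n\t nie:plainTextContent "blablabla" .\n'
metadata = parse_statements(sample)
print(sorted(metadata.keys()))
```

With the string from the victoria.tex call, the resulting dictionary contains both 'a' and 'nie:plainTextContent', which is what one would expect get_metadata to return if it handled this output.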
Best,
-Nikolaus
--
»Time flies like an arrow, fruit flies like a Banana.«
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C