Re: [Tracker] Using tracker extractors from other applications



Ivan Frade writes:
Hi,

On Wed, Nov 17, 2010 at 5:16 PM, Nikolaus Rath wrote:

Martyn Russell writes:
On 17/11/10 00:02, Nikolaus Rath wrote:

This gives me the following result:

...

So it seems that I still have to parse the entire string. Is there a way
to get the data in more structured form?


We have Python code that does this for our extraction test cases. Check the
class ExtractorHelper in:

http://git.gnome.org/browse/tracker/tree/tests/functional-tests/common/utils/helpers.py

  Basically:
extractor = ExtractorHelper ()
results = extractor.get_metadata (filename)

results is a dictionary with the property as key and a list of values as value.
There are a few tricks in the property->key translation (because of anonymous
nodes), but I think that with some prints you can figure out how it works.

This seems to work fine for the OpenOffice file:

In [39]: tracker_helper.ExtractorHelper().get_metadata('file:///home/nikratio/misc/Transaktionen.ods', '').keys()
Out[39]: 
[u'a',
 u'nie:generator',
 u'nie:plainTextContent',
 u'nco:publisher',
 u'nie:contentCreated']

but for the plain text file, there are no contents:

In [42]: tracker_helper.ExtractorHelper().get_metadata('file:///home/nikratio/misc/victoria.tex', '').keys()
Out[42]: [u'a']

Even though tracker itself returns the contents:

In [48]:     proxy = bus.get_object('org.freedesktop.Tracker1.Extract', 
                                    '/org/freedesktop/Tracker1/Extract')
             tracker = dbus.Interface(proxy, 'org.freedesktop.Tracker1.Extract')
             tracker.GetMetadata('file:///home/nikratio/misc/victoria.tex', '')
Out[52]: 
(dbus.String(u''),
 dbus.String(u' a nfo:PlainTextDocument ;\n\t nie:plainTextContent "blablabla" .\n'))
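
For comparison, here is a minimal sketch of how the raw string returned by
GetMetadata above could be turned into the same "property -> list of values"
shape. It is not the ExtractorHelper code; parse_flat_turtle is a hypothetical
name, and the sketch assumes a flat predicate/object list like the one shown
(no anonymous nodes, no typed literals, no unescaping of quoted strings).

```python
import re
from collections import defaultdict

# Tokens: a double-quoted literal (backslash escapes allowed), a ';' or ','
# separator, a standalone '.' terminator, or any other run of characters.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[;,]|\.(?=\s|$)|[^\s;,]+')


def parse_flat_turtle(body):
    """Map each predicate in a flat Turtle fragment to its list of values.

    Hypothetical helper: only handles 'pred obj ; pred obj, obj .' input.
    """
    result = defaultdict(list)
    predicate = None
    for token in _TOKEN.findall(body):
        if token == '.':
            break                      # end of the statement
        if token == ';':
            predicate = None           # next token starts a new predicate
        elif token == ',':
            continue                   # another value for the same predicate
        elif predicate is None:
            predicate = token
        else:
            # Strip the surrounding quotes from string literals.
            value = token[1:-1] if token.startswith('"') else token
            result[predicate].append(value)
    return dict(result)


body = ' a nfo:PlainTextDocument ;\n\t nie:plainTextContent "blablabla" .\n'
parsed = parse_flat_turtle(body)
```

With the string from Out[52], the result has both an 'a' key and a
'nie:plainTextContent' key, so the content is present in what the extractor
sends back.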


Is this a bug in ExtractorHelper?


Best,
 
   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C


