Re: [Tracker] Result of GetMetadata DBUS Method



On Sun, 25 Sep 2011 10:02:41 -0400,
Nikolaus Rath <Nikolaus rath org> wrote:

Adrien Bustany <abustany-rDKQcyrBJuzYtjvyW6yDsg public gmane org>
writes:
On Sat, 24 Sep 2011 18:50:06 -0400,
Nikolaus Rath <Nikolaus-BTH8mxji4b0 public gmane org> wrote:

Adrien Bustany
<abustany-rDKQcyrBJuzYtjvyW6yDsg-XMD5yJDbdMReXY1tMh2IBg public gmane org>
writes:
In tracker 0.8.17, an example result of the GetMetadata DBUS
method is

 a nfo:PaginatedTextDocument ;
        nie:title "SV Meldung" ;
        nco:creator [ a nco:Contact ;
        nco:fullname "nikratio"] ;
        nie:contentCreated "2011-08-10T20:12:38Z" ;
        nao:hasTag [ a nao:Tag ;
        nao:prefLabel "()"] ;
        dc:format "application/pdf" ;
        nie:description "()" ;
        nfo:pageCount 1 ;
        nie:plainTextContent "blablabla" .

With tracker 0.10.21, however, the same document now gives

 a nfo:PaginatedTextDocument ;
        nie:title "SV Meldung" ;
        nco:creator [ a nco:Contact ;
        nco:fullname "nikratio"] ;
        nie:contentCreated "2011-08-10T20:12:38Z" ;
        dc:format "application/pdf" ;
        nie:description "()" ;
        nao:hasTag ?tag1 ;
        nfo:pageCount 1 ;
        nie:plainTextContent "blablabla" .
} } WHERE { {
?tag1 a nao:Tag ; nao:prefLabel "()" .


I don't know enough about the syntax used here, so I won't
claim that it's wrong. However, rdflib fails to parse this
and thus breaks my application.

Can someone tell me if this is a bug in tracker or in rdflib?

Could you share the error message that rdflib gives you?

Sure, it's at
http://code.google.com/p/rdflib/issues/detail?id=190

Most likely rdflib doesn't like the fact that the WHERE of the
INSERT is not closed?

rdflib actually tries to parse this as Turtle RDF. But unless I'm
reading the specs wrong, Turtle does not have a WHERE, so am I
right to assume that GetMetadata has been changed to no longer
return valid Turtle RDF, but partial SPARQL statements?

If so, this makes it very hard to use GetMetadata() for anything
else but tracker. Is there any way to get something that's easier
to parse from tracker-extract?

Well, tracker-extract actually *never* returned turtle, at least as
far as I know (now, I'm not completely familiar with that
component). I guess the fact that previous versions returned SPARQL
that looked like turtle is at most a coincidence :) (note: turtle
does not have INSERT either).

Well, extract-metadata does not return an INSERT. Until the WHERE
started to appear, it was always valid turtle.

No, I guess the INSERT is added in the FS miner.


GetMetadata is an internal API call, so it is not exactly
meant to be easy to use for third-party apps, and it is actually tied
quite tightly to the SPARQL that the FS miner generates. I'm not
sure what the easiest way would be for you to use that API
now :/ Getting the complete SPARQL and inserting it into an
in-memory db (for example Sesame) before querying it would be an
option, but that really sounds complex and sub-optimal... A simpler
option would be to apply a few transformations to the SPARQL to
make it look like turtle again, but that does not sound like a
bulletproof solution.
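
The "few transformations" idea could look roughly like the Python
sketch below. It is only an illustration under assumptions, not
something Tracker provides: the function name fragment_to_turtle and
the sample subject URI are hypothetical, and the input shape is guessed
from the 0.10.21 output quoted earlier in this thread. It drops the
WHERE keyword and the stray braces and turns SPARQL variables into
blank node labels, leaving double-quoted literals untouched.

```python
import re

# Matches a double-quoted literal, including backslash escapes.
_LITERAL = re.compile(r'("(?:[^"\\]|\\.)*")')


def fragment_to_turtle(fragment):
    """Best-effort rewrite of a GetMetadata-style fragment (assumed shape)."""
    # With one capturing group, re.split puts the literals at odd indices.
    parts = _LITERAL.split(fragment)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # quoted literal: keep verbatim
            continue
        part = re.sub(r"\bWHERE\b", " ", part)            # drop the keyword
        part = part.replace("{", " ").replace("}", " ")   # drop stray braces
        part = re.sub(r"\?(\w+)", r"_:\1", part)          # ?tag1 -> _:tag1
        out.append(part)
    return "".join(out)


# Hypothetical sample modelled on the output quoted in this thread.
sample = (
    '<urn:example:doc> a nfo:PaginatedTextDocument ;\n'
    '        nie:description "what? {x}" ;\n'
    '        nao:hasTag ?tag1 .\n'
    '} } WHERE { {\n'
    '?tag1 a nao:Tag ; nao:prefLabel "()" .'
)
result = fragment_to_turtle(sample)
print(result)
```

The result still lacks @prefix declarations for nfo:, nie:, nao: and
so on, which a Turtle parser would need before accepting it, and any
output that does not follow the assumed shape (for example other
literal quoting styles) would need extra handling.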

Would it maybe be possible to add an additional API call, say
ExtractPlaintext, that returns just the (unstructured) plain text and
could easily be used by third-party apps?


Best,

   -Nikolaus




