Re: [Tracker] Request for review, API allowing external software like MTP daemons to get metadata as correct sparql insert



On Fri, 2013-01-04 at 15:20 +0000, Martyn Russell wrote:
On 03/01/13 21:30, Philip Van Hoof wrote:

I'll reply to the most important parts of your review:

8. This looks a bit wrong to me. First, we always set the file from the 
temp_file variable given, but then we might set the url to be something 
different based on dest_url. Not only is this confusing but I expect it 
to cause problems inside tracker if the file used for the URL != file 
used for the PATH. I can see why you want to do this, but we didn't do 
it this way round with Nokia, we issued a separate update for the file 
details when it actually was saved as something else. The problem I see 
is that you could have a query which returns urn->path as /foo and 
urn->uri as file:///bar and that's clearly wrong:

Yes the separate update was/is the solution for Nokia's N9. The idea
here or now is that no separate update would be needed at all with this
solution: the MTP daemon does both the extraction and the initial sparql
insert in one pass.

And then the MTP daemon does a rename() call from tempfile to dest-url.
Completing the file transfer.

That sparql insert can happen while the download takes place (allowing
the MTP daemon to insert protocol based metadata before file data is
transferred and available - for extraction).

15. I've not checked the whole on_fileinfo_received() part because it's 
really complex in the miner-fs already and we were fixing problems there 
for years related to parent existence and creation prior to insertion of 
a child path. This worries me about this branch because I get the 
impression from quick review that it's not really addressed fully. You 
also seem to allow parent URNs to not exist before adding a child to the 
database. That's not how we've done things in the miner-fs and this 
really breaks what people using the DB can expect. Correct me if I am wrong.

Right, parent URNs aren't yet created with this solution and should
probably be.

I guess putting INSERT queries upfront the metadata one needs to be
possible as well as getting a separate query for in case the MTP daemon
wants to do multiple passes. ie. First protocol based metadata insert,
which needs the parent URNs at that point and which doesn't yet need the
file content to be available, then extraction based metadata insert; for
which the file content must be available.

And what this creates must also be compatible with what miner-fs will
otherwise create.

The idea here is that while a (large) file is being transferred,
protocol based metadata is already in the Nepomuk store available for
applications that are interested in tracker:available-false 'to be
deployed' data. ie. a DivX or DVD being transferred and a movie
application saying: "DVD with title XYZ with actors ABC and written by
DEF is being transferred to your phone (20% completed)". And After that
an extraction to get extra metadata that isn't included in the protocol
based metadata (and all that without needing miner-fs).

16. I presume the tracker-storage.[ch] addition to libtracker-extract is 
a straight copy of the files from libtracker-miner? I am not sure if 
this is such a good idea - but I've not been sure what to do here for a 
while now. From a quick git grep on the code, it seems we only use it in 
tracker-writeback, tracker-extract and tracker-miner-fs, and since 
miner-fs already links with libtracker-extract, it looks like the 
logical choice to move it there.

Right

22. The reason I asked for a sync API is so we could use it in 
tracker-sparql ;)

Not sure if a sync api is really needed since it would just wrap a
GMainLoop anyway. But could be done for convenience.


Kind regards,

Philip


-- 
Philip Van Hoof
Software developer
Codeminded BVBA - http://codeminded.be




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]