[Tracker] The extract-sparql branch and point 15 in Martyn's review



Team,

Although it's my own fault for not having finished the work[0] after Martyn's excellent review[1] ...

In point 15 Martyn correctly points out that parent URNs for files aren't addressed in the extract-sparql branch. This means that whenever you want to add metadata for a (future) file by using tracker_extract_get_sparql to get the metadata from our extractors, you don't get the directory tree's metadata along with it.

I.e., asking for the metadata for /home/user/Documents/Directory/Kind/File.png will give you metadata for File.png (although File.png might not exist yet, as you're still building it as /tmp/.tmp.File.png), but not for /, /home, /home/user, /home/user/Documents, /home/user/Documents/Directory and /home/user/Documents/Directory/Kind. Yet your plan is to rename() /tmp/.tmp.File.png to /home/user/Documents/Directory/Kind/File.png, which means that for the query you get back to succeed, you'll need all of it (including the metadata for the destination's tree).
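To make the missing piece concrete, here is a minimal sketch (in Python, purely for illustration; it is not Tracker code) that lists the parent directories whose metadata would be needed for a given destination path. The function name ancestor_directories is hypothetical and nothing here calls into libtracker-extract.

```python
from pathlib import PurePosixPath


def ancestor_directories(destination):
    """Return every parent directory of `destination`, root first.

    Hypothetical helper: these are the directories whose metadata is
    currently missing from what you get back for the file itself.
    """
    path = PurePosixPath(destination)
    # PurePosixPath.parents runs from the nearest parent up to "/",
    # so reverse it to get the order in which the tree would be described.
    return [str(parent) for parent in list(path.parents)[::-1]]


if __name__ == "__main__":
    for directory in ancestor_directories(
        "/home/user/Documents/Directory/Kind/File.png"
    ):
        print(directory)
```

For the example path this prints /, /home, /home/user, /home/user/Documents, /home/user/Documents/Directory and /home/user/Documents/Directory/Kind, one per line. The path handling is purely lexical, so it also works for a destination that does not exist yet.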

So although this is my own fault (for not having finished it by now), I recognize today (almost a year later) that I'm not following up on this enough. Apparently I can't find the time for this (I've been renovating a house, having a girlfriend (doing stuff grown-up people apparently 'need' to do), having a daytime job and not being paid to do Tracker work anymore, etc., and being guilty of being lazy and watching dumb television instead of coding - I agree -). So I'm hoping that the other members of the team would be willing to progressively refactor the FS miner in such a way that getting this tree of metadata out of the FS miner can be done by a library like libtracker-extract. By that I mean blocking the FS miner from doing it, while still getting the information and building the correct SPARQL queries.

I'm wondering what the other Tracker team members think about this. I think it's important to allow applications to bypass the FS miner. Not because I think the FS miner isn't great (I think it is; I think Lanedo's developers did an awesome job), but because I think our metadata system should be flexible enough to allow such metadata input. And because it avoids race conditions and metadata entry timing issues: allowing metadata to be inserted ahead of the rename() call lets us atomically do things fully correctly with the tracker:available property of information elements.
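The ordering I have in mind can be sketched as follows. This is only an illustration in Python under stated assumptions: publish_with_metadata and the store_metadata callback are hypothetical names, and the callback stands in for whatever would actually insert the SPARQL for the file and its directory tree. Unlike the /tmp example above, the temporary file is created in the destination directory, so that rename() stays on one filesystem.

```python
import os
import tempfile


def publish_with_metadata(data, destination, store_metadata):
    """Write `data` to a temp file, store metadata, then rename() into place.

    `store_metadata` is a hypothetical callback taking the destination path;
    it is called before the file becomes visible under its final name.
    """
    dest_dir = os.path.dirname(destination)
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp.", dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # Metadata goes in first, while the destination does not exist yet.
        store_metadata(destination)
        # On POSIX, rename() within one filesystem is atomic.
        os.rename(tmp_path, destination)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this ordering there is no window in which the file exists under its final name without its metadata being known.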

We're so close. We should just do it and get it over with.

Kind regards,

Philip


[0] https://mail.gnome.org/archives/tracker-list/2012-December/msg00021.html
[0] https://git.gnome.org/browse/tracker/log/?h=extract-sparql
[1] https://mail.gnome.org/archives/tracker-list/2013-January/msg00001.html





