Re: [Tracker] Request for review, API allowing external software like MTP daemons to get metadata as correct sparql insert



Hey Martyn,

I have quite a bit on my plate lately with renovating houses and other
stuff, like finding a new customer etc, that I'll delay working on this
for some time for now.

If somebody wants to pick it up, I mostly agree with your review
commends and adapting the branch to fix these issues would be a great
start.

Perhaps I'll pick it up myself starting by fixing these issues. So if
somebody does start adapting the branch a little note would be
appreciated (to avoid doubling efforts).

Kind regards,

Philip

On Fri, 2013-01-04 at 15:20 +0000, Martyn Russell wrote:
On 03/01/13 21:30, Philip Van Hoof wrote:
Hey Guys,

Hello Philip,

Again, is somebody planning to review this? If not I'm planning to go
ahead and push it to master.

I have just reviewed it briefly. My comments:

1. There are some white space issues here which need cleaning up. 
Trailing white space, etc. see 
src/libtracker-extract/tracker-extract-sparql.c.

2. Coding style too.

3. Shouldn't you use the tracker_sparql_* API properly in 
sparql_builder_finish() ? WE're appending strings here but we have APIs 
for inserting URNs and blank notes already which could make this a lot 
easier. We should use these:

   tracker_sparql_builder_object_blank_open() and _close()
   (unless you need to refer to _:file later on of course)

   and

   tracker_sparql_builder_object_iri()

3. Extra spaces before end of function are unnecessary ;)
+       g_clear_error (&error);
+

4. I would have made sparql_builder_finish() take arguments in the order 
we use in TrackerExtractInfo for consistency, but that's me being pedantic.

5. tracker_extract_get_sparql() takes a "temp_file", is teh file 
temporary? Is it a URL or a path? we should be more specific with 
variables like this especially if this is new API.

6. No need for the check here, just strdup it... also, I would keep the 
variable name the same so it's clear what it is we're passing (i.e. a urn).

+       if (graph) {
+               data->graph_urn = g_strdup (graph);
+       }

7. Why not use -1 or a valid time_t for last_mod and last_access? Not 
sure why we need the extra _set booleans here.

+       if (last_mod != 0) {
+               data->last_mod = last_mod;
+               data->last_mod_set = TRUE;
+       } else {
+               data->last_mod_set = FALSE;
+       }

8. This looks a bit wrong to me. First, we always set the file from the 
temp_file variable given, but then we might set the url to be something 
different based on dest_url. Not only is this confusing but I expect it 
to cause problems inside tracker if the file used for the URL != file 
used for the PATH. I can see why you want to do this, but we didn't do 
it this way round with Nokia, we issued a separate update for the file 
details when it actually was saved as something else. The problem I see 
is that you could have a query which returns urn->path as /foo and 
urn->uri as file:///bar and that's clearly wrong:

+       data->storage = tracker_storage_new ();
+       data->file = g_file_new_for_path(temp_file);
+       if (dest_url) {
+               data->url = g_strdup (dest_url);
+       } else {
+               data->url = g_file_get_uri (data->file);
+       }

You could ALWAYS set the url from the file and keep the dest_url as a 
separate struct member to help avoid any confusion and I think it would 
make a lot more sense here.

9. No need to use a variable here, you can just return the function 
value with a cast I imagine:

+       res = g_simple_async_result_get_op_res_gpointer (simple);
+
+       return res;


10. Please put system includes before glib and libraries (coding style):

+#include <glib.h>
+#include <gio/gio.h>
+#include <time.h>
+

11. I wonder why you don't provide a sync API too?

12. This is wrong too:

+#endif /* __LIBTRACKER_EXTRACT_ENCODING_H__ */

13. Copyright should probably be 2013 ;)

14. Why do we change src/libtracker-common/tracker-marshal.list at all?

15. I've not checked the whole on_fileinfo_received() part because it's 
really complex in the miner-fs already and we were fixing problems there 
for years related to parent existence and creation prior to insertion of 
a child path. This worries me about this branch because I get the 
impression from quick review that it's not really addressed fully. You 
also seem to allow parent URNs to not exist before adding a child to the 
database. That's not how we've done things in the miner-fs and this 
really breaks what people using the DB can expect. Correct me if I am wrong.

16. I presume the tracker-storage.[ch] addition to libtracker-extract is 
a straight copy of the files from libtracker-miner? I am not sure if 
this is such a good idea - but I've not been sure what to do here for a 
while now. From a quick git grep on the code, it seems we only use it in 
tracker-writeback, tracker-extract and tracker-miner-fs, and since 
miner-fs already links with libtracker-extract, it looks like the 
logical choice to move it there.

17. If you do plan to add tracker-storage.[ch] to libtracker-extract, 
you need to update the documentation so people know it wasn't added in 
0.8 but in 0.16. ;)

18. White space issues in src/libtracker-miner/Makefile.am

19. I didn't see any patches for the documentation and since you're 
moving the whole tracker-storage.[ch] API to another library, you 
definitely need to do that.

20. For tracker-sparql, I would prefer a cleaner approach. Right now, 
it's really clear the utility is used for querying mainly. I think it 
would be better to use:

   "insert-file-path" instead of "metadata-file-path"

   i.e. insert instead of metadata for all new options there.

It's then much clearer when looking at the options that they're about 
specifying data to insert. It's also shorter :)

I also think that "metadata-file-path" is a bit long, can't we just use 
"insert-path"? The argument clearly states it's a FILE too. Same for 
"metadata-graph-urn", "insert-graph" is shorter and URN is the type it 
expects. Same again for "metadata-dest-url", would be better as 
"insert-url".

Finally, I think that "metadata-available" needs to have a proper value 
type instead of NULL. It's not clear if "true" or "1" or "yes" are 
really the right thing to put in here, we should give an example of the 
boolean system being used here in the description text if it's still not 
clear.

21. White space issues in tracker-sparql.c

22. The reason I asked for a sync API is so we could use it in 
tracker-sparql ;)

--

NOTE: I have not compiled or tested this branch yet, this is just my 
review of the code.

I can do some testing after some of the issues above are addressed - but 
I am a bit concerned to see it just committed in it's current state.

Thanks for the work here all the same Philip!


-- 
Philip Van Hoof
Freelance software developer
Codeminded BVBA - http://codeminded.be




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]