Hi there,
This is a new version of the patch.
Garnacho: You want to review the part where I'm changing your precious
src/tracker-indexer/tracker-module-metadata.c at line 1016 in this diff.
Ivan: Continue committing in the "turtle" branch. I'll cope by merging
your changes into the next patch and using some SVN command-line tricks.
Martyn: Re-review ;-). I commented inline (see lower)
All you guys: Have fun
1. Please remember in the next patch to svn add the files which are
new so people can compile with the applied patch.
Done this time
2. Use AM_CONDITIONAL in configure.ac:
+if test x$have_raptor == "xyes"; then
+ AC_DEFINE(HAVE_RAPTOR, 1, [Raptor RDF parsers])
+fi
Everywhere else does:
AM_CONDITIONAL(HAVE_RAPTOR, test "$have_raptor" = "yes")
AM_CONDITIONAL is not the same as getting the define in config.h.
There's also a discussion started about making libraptor mandatory.
The conclusion of that discussion will influence this configure.ac
magic.
I left this as is (using AC_DEFINE)
3. Do we require a particular version of raptor?
Not at this moment. Also this is affected by the discussion about
libraptor taking place. Not adapting the patch for this yet.
4. Do we need to check S.Enabled here and/or the Volumes table at all?
Also, isn't it more logical to use "M" instead of "F" for the
metadata
[CUT SQLite stored statements]
This is under Ivan's supervision. This new patch has some updates about
the things that I will in this reply annotate as Ivan's responsibility.
Not everything being done by Ivan is already in this patch yet, though.
Ivan is continuing with fixing the issues in the "turtle" branch. I will
continue merging my merged checkout of trunk with his improvements.
So far these three are the merge commands executed:
svn merge -r 2555:2615 ../branches/turtle/
svn merge -r 2615:2680 ../branches/turtle/
svn merge -r 2680:2682 ../branches/turtle
About this specific issue, I think I saw Ivan commit something about
these queries in the turtle branch. I also noticed some differences in
the original merge patch, and this new one, about these queries.
I'll let Ivan reply if he wants to about this.
5. Is the notation "(out)" the same notation used in gtk-doc now?
+ * @mount_mount: (out): if @path is on a removable device, the mount
point will
+ * be filled in here
I have removed the annotations for now. The GObject-Introspection's
annotation format has not yet been finalized. I have added a humanly
explained comment about how to deal with @mount_point.
6. This needs fixing in tracker-hal.c, also it doesn't say if it
returns
TRUE or FALSE, usually that's a good idea:
+ * Returns Whether or not @path is on a known removable device
+ *
+ * Returns: Whether or not @path is on a known removable devic
Fixed in the "Returns" annotation. Same here, I am not yet using the
GObject-Introspection annotation syntax. I think it would be better if
we make one huge patch adding annotations everywhere, than to do it
sporadically API per API.
6. This doesn't need to be cast:
+ udi = (const gchar*) key;
Fixed
7. Indentation here is wrong and we could use g_strcmp0() instead so
we
don't have to check mp for NULL too:
+ if (mp && strcmp (mp, path) != 0) {
+ if (g_strrstr (path, mp)) {
Done
8. This looks like a memory leak, volume is not freed:
if (available)
*available = libhal_volume_is_mounted
(volume);
break;
It's not. Volume is freed. Double checked this. Maybe the original patch
didn't have it. This now one, in any case, has it.
+ libhal_volume_free (volume);
10. The changes to tracker_data_metadata_insert() actually look like
they fix a potential leak in TRUNK, i.e. inserting the same thing
twice. Nice catch.
Sure. Carefully review all those, though. I went over all users of the
APIs to make sure they are still all validly using it. But feel free to
add some extra eyes to that process.
11. This looks like it would cause a crasher now that we don't own all
data sent to libtracker-data/tracker-data-metadata.c, I don't think we
should be freeing the list there or its contents:
- g_return_if_fail (TRACKER_IS_FIELD (field));
+ if (!field) {
+ g_warning ("Field name '%s' has isn't described in the
ontology",
field_name);
+ g_list_foreach (list, (GFunc) g_free, NULL);
+ g_list_free (list);
+ return;
+ }
+
This was indeed different in the Turtle branch and has indeed been
refactored in Trunk. I have merge-rewritten these functions to act the
way they are expected to act as Trunk got refactored to.
12. Same as #11 for tracker_data_metadata_append_to_list() where we
free "value".
Agreed, fixed too.
13. Just wondering about the name here, shouldn't it be
GetMetadataBackup? I mean, the Turtle file is just a metadata backup
right, I don't quite know how the "Embedded" part comes into it.
+ result_set = tracker_data_manager_exec_proc (iface,
+
"GetEmbeddedMetadataBackup",
+ NULL);
This is under supervision of Ivan. I don't think it is fixed in this new
version of the patch yet. I'll let Ivan reply and handle this issue.
14. Please put _ALL_ structures and typedefs at the top of the file:
+/* TODO: URI branch path -> uri */
+
+typedef struct {
+ TrackerService *service;
+ guint32 iid_value;
+ TrackerLanguage *language;
+ TrackerConfig *config;
+} ForeachInMetadataInfo;
+
Fixed
15. Please put all static variables at the top of the file:
+ static TrackerConfig *config = NULL;
+ static TrackerLanguage *language = NULL;
Fixed
16. When linning up variables don't add extra spaces:
+void
+tracker_data_delete_service (const gchar *path,
+ const gchar *rdf_type)
+{
+ TrackerService *service;
+ const gchar *service_type;
+ guint32 service_id;
Fixed
17. I wonder if we should recursively delete here even if there is no
top level service ID found - in case there is ever something below it
left in the database. Also remember, we changed this at the code camp
in Helsinki, I am still going through the patch to see if this is a
conflict at all:
+ if (service_id != 0) {
+ tracker_data_update_delete_service (service, service_id);
+ tracker_data_update_delete_service_recursively (service, (gchar *) path);
+ tracker_data_update_delete_all_metadata (service, service_id);
+ }
Carlos replied that this is logically correct
18. Can we format these arguments according to the coding style
please:
+set_metadata (TrackerField *field, gpointer value,
ForeachInMetadataInfo *info)
Fixed
19. Every time I look at this, I think it is crack, that's why I added
the comment. I know we do this in the indexer too, but I wonder if we
should change this to something sensible with a reason too:
+ /* Throttle indexer, value 9 is from older code, why 9? */
+ throttle = tracker_config_get_throttle (info->config);
+ if (throttle > 9) {
+ tracker_throttle (info->config, throttle * 100);
+ }
Sure, if the merge happens before the fix then thw two locations.
I think we should keep the logic in this patch because what is
functionally being added is basically the exact same. Except that it's
coming from the Turtle file instead of from the FS scanner.
I try to keep patches about functionality A separated from patches that
fixes obscure weird hard coded B things.
If trunk fixes this before this merge takes place, then let me know and
I'll adapt this one to the fixed situation too. Of course.
If it gets fixed after merge, we just fix the two locations. It's not
that this is a flood of code here ;-) (just search for "why 9?").
20. I would have used for() instead of while() for clarity here:
+ GList *list;
+
+ list = value;
+
+ while (list) {
+ set_metadata (field, list->data, user_data);
+ list = list->next;
+ }
i.e.
for (list = value; list; list = list->next)
set_metadata (field, list->data, user_data);
Fixed
21. I think the API should be a bit more uniform here, i.e.
tracker_data_turtle_import_{start|stop}:
+ void tracker_data_start_turtle_import (void);
+ void tracker_data_stop_turtle_import (void);
Fixed
22. Instead of duplicating the config and language modules, can we not
just call an initialize and termination function to do all this and
wrap all static variables in a GStaticPrivate? To save memory and
also if this is done more than once per invocation of the application
it really makes sense. Right now I think TrackerConfig and
TrackerLanguage are only ever created once per application. I would
like to keep it that way if possible.
+void
+tracker_data_start_turtle_import (void)
+{
+ if (!config)
+ config = tracker_config_new ();
+ if (!language)
+ language = tracker_language_new (config);
+}
Then please make TrackerConfig and TrackerLanguage singleton classes.
23. All public functions we need variable checking, e.g.
tracker_data_delete_service()
tracker_data_replace_service()
need (for example):
g_return_if_fail (path != NULL);
Fixed
24. We should use g_debug here not g_print, and the preprocessor
comments should be on column 0 at the start of the line.
+ #ifdef TURTLE_DEBUG
+ g_print ("Q ERROR: %s\n", error->message);
+ #endif /* TURTLE_DEBUG */
+ g_error_free (error);
Removed/Fixed/Adapted/Stuffed
25. Indentation:
+ GValue id_value = { 0, };
+ GValue is_value = { 0, };
+ gint iid_value, iis_value;
+ guint32 id;
Fixed
26. API wise, what is the difference between
tracker_data_delete_service()
and
tracker_data_update_delete_service()
both look like they should do the same thing except one is the uniform
API which JÃrg worked on. That should be clearer I think. The same
namespace should be used for the _replace_ variant too.
The namespace has been fixed ('update' is added to it)
The difference is that the new one removes using the path or (soon when
the path->URI work is done) using the URI. The first removes using the
service-id. This service-id is outside of the scope of the Turtle code,
therefore is the functionality moved into the database specific pieces
and is the Turtle code just calling it that way.
27. We need spaces in here:
+ info->config = config?g_object_ref (config):NULL;
+ info->language = language?g_object_ref
(language):NULL;
Fixed
28. Bloody good catch guys :)
- if (file_iface)
+ if (file_iface) {
g_object_unref (file_iface);
- if (email_iface)
+ file_iface = NULL;
+ }
+
+ if (email_iface) {
g_object_unref (email_iface);
- if (xesam_iface)
+ email_iface = NULL;
+ }
+
+ if (xesam_iface) {
g_object_unref (xesam_iface);
+ xesam_iface = NULL;
+ }
Np
29. More spaces we don't need :)
+void
+tracker_indexer_commit_transaction (TrackerIndexer *indexer)
+{
+ stop_transaction (indexer);
+ tracker_indexer_set_running (indexer, TRUE);
+
+}
+
+void
+tracker_indexer_open_transaction (TrackerIndexer *indexer)
+{
+ tracker_indexer_set_running (indexer, FALSE);
+ start_transaction (indexer);
+}
+
Fixed
30. I would change tracker_indexer_commit_transaction() and
tracker_indexer_open_transaction() to be
tracker_indexer_transaction_*()
instead.
Fixed
31. There is no need to set mount_point to NULL after freeing it if it
is unused again.
Fixed
32. Shouldn't we do something in an "else" for the
tracker_hal_path_is_on_removable_device() for "path", like we do for
"other_path"?
+ if (tracker_hal_path_is_on_removable_device (indexer->private->hal,
+ path,
+ &mount_point,
+ NULL)) {
+ if (tracker_hal_path_is_on_removable_device (indexer->private->hal,
+ other_path,
+ NULL,
+ NULL)) {
+ tracker_removable_device_add_move (indexer,
+ mount_point,
+ path,
+ other_path,
+ tracker_service_get_name (service));
+ } else {
+ tracker_removable_device_add_removal (indexer,
+ mount_point,
+ path,
+ tracker_service_get_name (service));
+ }
+ }
Nope, the logic of that code is good
33. APIs like this really confuse me:
tracker_removable_device_add_move()
tracker_removable_device_add_removal()
Can we come up with something a bit more intuitive?
Nope. It means "add a record about a move", "add a record about a
remove".
It's not the same as updating your database as the Turtle file is an
always-append format. You append "that you removed something" and "that
you moved something". You don't go into the file and remove something,
nor do you go into the file and move something. You append that piece of
info, in the form of an "event", to the file. That's the format.
That's why it's called that way.
34. We don't need to allocate memory here:
+ values[0] = g_strdup (triple->object);
+ values[1] = NULL;
...
+
+ g_free (values[0]);
Fixed
35. All request IDs are generated before anything else. This is for
debugging purposes and so we know how many requests we have _REALLY_
received regardless of if there was some parameter check failure.
This
needs to be before any tracker_dbus_async_return_if_fail() calls:
+ request_id = tracker_dbus_get_next_request_id ();
Fixed
36. This normally comes up as a warning depending on compiler flags,
"(void)" is needed:
+static gchar *
+get_turtle_userdata_backup_filename ()
Me fixed for Ivan
37. Checking private is NULL or not is pointless here because it is
created in main() and never freed - I just noticed :) so I should fix
that. Also, we should store this value in private, since we often get
"finished" more than once from the indexer and recreating the string
here would be unnecessary:
+ if (private) {
+ return g_build_filename (private->user_data_dir,
+ "tracker-userdata-backup.ttl",
+ NULL);
+ } else {
+ g_critical ("Directories not initialized");
+ return NULL;
+ }
Under supervision of Ivan.
38. This should be fixed now (sending finished twice at the same time).
But also you should know that "finished" is sent whenever the indexer
finishes indexing ANY content from ANY file changes on the disk. So this
solution is broken anyway. It will need fixing:
+ if (counter >= 2) {
Under supervision of Ivan.
39. I would rather we are a bit more descriptive for variable names,
which callback_id this is, isn't obvious without searching further in
the code, this is in main() for trackerd:
+ gulong callback_id;
Under supervision of Ivan.
40. More spacing issues:
if (force_reindex) {
+
+ gchar *turtle_file;
+
+ turtle_file = get_turtle_userdata_backup_filename ();
Under supervision of Ivan.
41. Eeeek, what is this for? You are duplicating a lot of stuff here!
The tracker_db_manager_init() is called just after this. The
tracker_db_index_manager_init() is also called after this. The
tracker_data_manager_init() is also called after this. As I understand
it you just want to save the turtle file after an index right -
specifically the initial index? If so, then why not just check the
"initial index" state using tracker_status_get_is_first_time_index().
You could also use this in the "finished" callback.
This code looks to me like it would break a lot of things and it
shouldn't get committed:
if (force_reindex) {
+
+ gchar *turtle_file;
+
+ turtle_file = get_turtle_userdata_backup_filename ();
+
+ g_message ("Saving metadata in %s", turtle_file);
+
+ /* Init the DB stack */
+ tracker_db_manager_init (0, &is_first_time_index, TRUE);
+
+ tracker_db_index_manager_init (0,
+ tracker_config_get_min_bucket_count (config),
+ tracker_config_get_max_bucket_count (config));
+
+ file_index = tracker_db_index_manager_get_index (TRACKER_DB_INDEX_FILE);
+ email_index = tracker_db_index_manager_get_index (TRACKER_DB_INDEX_EMAIL);
+
+ tracker_data_manager_init (config, language, file_index, email_index);
+
+ tracker_backup_save (turtle_file);
+
+ /* Shutdown the DB stack */
+ tracker_data_manager_shutdown ();
+
+ tracker_db_index_manager_shutdown ();
+ tracker_db_manager_shutdown ();
+
+ g_free (turtle_file);
+
Under supervision of Ivan.
42. I don't think you want to do this ONLY on FORCE_REINDEX. Since that
is rarely the case. Although I admit I am not entirely sure what
org_freedesktop_Tracker_Indexer_restore_backup() is supposed to do from
the API name. Does it write the Turtle file to the DB? If so, then why
would you want to write metadata to the database AFTER it has just been
written to with all the latest information from the indexer? If your
intention (on the other hand) is to write initial data to the database
on a reindex before the indexer has had a chance to index the real data
it gets, then I think this is going about it the wrong way.
+ if (flags & TRACKER_DB_MANAGER_FORCE_REINDEX) {
+ g_debug ("Setting callback for crawling finish detection");
+ callback_id = g_signal_connect (private->processor, "finished",
+ G_CALLBACK (crawling_finished_cb),
+ &callback_id);
+ }
+
Under supervision of Ivan.
43. About tracker-turtle.h:
a) Why is MetadataItem public, it looks unused in the header.
b) Why doesn't it use a proper namespace?
i.e. TrackerTurtleMetadataItem
c) Why don't the callback typedefs have Tracker in the namespace?
i.e. TrackerTurtleTripleCallback
d) What happened to the coding style here?
e) Include order is incorrect.
f) Includes should have config.h
g) G_BEGIN_DECLS and G_END_DECLS are needed.
h) Why not use Glib types here? gint, gchar, etc.
i) When I saw "stmt" in the code, I had to find out what it was, where
it came from, etc. Namespacing issues there again need sortingout.
j) I wouldn't include Jamie's name in the copyright since he didn't
write it.
All fixed
44. About tracker-turtle.c:
a) The "tracker-turtle.h" include should come after all external
includes.
b) There is no need to include the same files already included in the
tracker-turtle.h.
c) In tracker_turtle_open(), you just print a critical if not
initialised. I would suggest you use g_return_val_if_fail () instead
since.
d) I would use g_fopen() not fopen() for compatibility across
platforms.
e) Is turtle->uri a uri or a path? Variable names should reflect this
accurately:
i.e. turtle->uri = raptor_new_uri ("/");
raptor_new_uri converts the string to a URI, so it's a URI and that
makes the variable name crystal clear (uri is a URI, indeed).
f) Can we use proper variable names like "count" instead of 3 letter
variable names that look awefully like another English word I dare
not
utter :)
i.e. guint cnt;
g) There are some serious code style issues there.
h) I wouldn't include Jamie's name in the copyright since he didn't
write it.
All fixed
45. About tracker-removable-device.h:
a) Include order is incorrect.
b) G_BEGIN_DECLS and G_END_DECLS are needed.
c) I wouldn't include Jamie's name in the copyright since he didn't
write it.
All fixed
d) tracker_removable_device_add_move() shouldn't this be
tracker_removable_device_move()?
No, you are not moving a removable device. You are adding info about the
moving of a file that is located on a removable device to the metadata
of the removable device. Your API proposal would make it look like as if
you re moving the removable device.
e) tracker_removable_device_add_removal() shouldn't this be
tracker_removable_device_removal()?
Same. It has nothing to do with removing the removable device. This has
to do with adding the removal of a file on the removable device to the
metadata of the removable device.
46. About tracker-removable-device.c:
a) The #ifdef HAVE_RAPTOR at the top of this file (I would expect)
should go right to the end of the file since it is so high up, but it
doesn't.
Uh? Reread the file. It's not because the ifdef is high-up that this
means that the entire file is being shadowed away with the ifdef.
I think we should decide if we are making the whole file
require that define or just parts of it and fix it accordingly.
Sure, this is being discussed on the ML atm. Leaving as-is.
b) Do you really need all those includes or is that a copy and paste
from tracker-indexer.c :)
Fixed. Seems it was indeed a C/P from tracker-indexer.c.
c) Again all typedefs need proper namespacing including the Tracker
name
otherwise it is confusing as hell when you see it somewhere.
d) Coding style needs fixing.
All fixed