Re: [Tracker] Merging Turtle branch to Trunk



El mar, 09-12-2008 a las 18:17 +0100, ext Philip Van Hoof escribiÃ:
Hi there,

This is a new version of the patch.

Garnacho: You want to review the part where I'm changing your precious 
src/tracker-indexer/tracker-module-metadata.c at line 1016 in this diff.

Ivan: Continue committing in the "turtle" branch. I'll cope by merging
your changes into the next patch and using some SVN command-line tricks.

Martyn: Re-review ;-). I commented inline (see lower)

All you guys: Have fun



4. Do we need to check S.Enabled here and/or the Volumes table at all?
Also, isn't it more logical to use "M" instead of "F" for the
metadata 

[CUT SQLite stored statements]


About this specific issue, I think I saw Ivan commit something about
these queries in the turtle branch. I also noticed some differences in
the original merge patch, and this new one, about these queries.

I'll let Ivan reply if he wants to about this.

 Renamed the alias for the table to M. There is no need of checking
"Enable": we save all the user metadata and ignore the files that don't
exists when we restore the data again.


13. Just wondering about the name here, shouldn't it be 
GetMetadataBackup? I mean, the Turtle file is just a metadata backup 
right, I don't quite know how the "Embedded" part comes into it.

+       result_set = tracker_data_manager_exec_proc (iface,
+
"GetEmbeddedMetadataBackup",
+                                                    NULL);

This is under supervision of Ivan. I don't think it is fixed in this new
version of the patch yet. I'll let Ivan reply and handle this issue.

 Yes, the name of the procedure was totally wrong, because it is saving
NON-embedded metadata. Renamed to GetUserMetadataBackup.


17. I wonder if we should recursively delete here even if there is no 
top level service ID found - in case there is ever something below it 
left in the database. Also remember, we changed this at the code camp 
in Helsinki, I am still going through the patch to see if this is a 
conflict at all:

+       if (service_id != 0) {
+               tracker_data_update_delete_service (service, service_id);
+               tracker_data_update_delete_service_recursively (service, (gchar *) path);
+               tracker_data_update_delete_all_metadata (service, service_id);
+       }

Carlos replied that this is logically correct

 We should check if the service is a folder before calling
_delete_service_recursively. That function is highly inefficient and was
causing the issue of "deletions are very slow".


36. This normally comes up as a warning depending on compiler flags, 
"(void)" is needed:

+static gchar *
+get_turtle_userdata_backup_filename ()

Me fixed for Ivan

 Thanks :) 

37. Checking private is NULL or not is pointless here because it is 
created in main() and never freed - I just noticed :) so I should fix 
that. Also, we should store this value in private, since we often get 
"finished" more than once from the indexer and recreating the string 
here would be unnecessary:

+       if (private) {
+               return g_build_filename (private->user_data_dir,
+                                        "tracker-userdata-backup.ttl",
+                                        NULL);
+       } else {
+               g_critical ("Directories not initialized");
+               return NULL;
+       }

Under supervision of Ivan.

 We are recreating this string only two times, so i think there is no
need to keep it in memory all the time. Anyway, i added the string into
private.


38. This should be fixed now (sending finished twice at the same time). 
But also you should know that "finished" is sent whenever the indexer 
finishes indexing ANY content from ANY file changes on the disk. So this 
solution is broken anyway. It will need fixing:

+       if (counter >= 2) {

Under supervision of Ivan.

 For some reason it is broken again :/ We are receiving two "Finished"
signals.


39. I would rather we are a bit more descriptive for variable names, 
which callback_id this is, isn't obvious without searching further in 
the code, this is in main() for trackerd:

+       gulong                      callback_id;

Under supervision of Ivan.

Renamed to "restore_cb_id"

40. More spacing issues:

        if (force_reindex) {
+
+               gchar              *turtle_file;
+
+               turtle_file = get_turtle_userdata_backup_filename ();

Under supervision of Ivan.

 That line doesn't exist due the changes in #37


41. Eeeek, what is this for? You are duplicating a lot of stuff here! 
The tracker_db_manager_init() is called just after this. The 
tracker_db_index_manager_init() is also called after this. The 
tracker_data_manager_init() is also called after this. As I understand 
it you just want to save the turtle file after an index right - 
specifically the initial index? If so, then why not just check the 
"initial index" state using tracker_status_get_is_first_time_index(). 
You could also use this in the "finished" callback.

This code looks to me like it would break a lot of things and it 
shouldn't get committed:

        if (force_reindex) {
+
+               gchar              *turtle_file;
+
+               turtle_file = get_turtle_userdata_backup_filename ();
+
+               g_message ("Saving metadata in %s", turtle_file);
+
+               /* Init the DB stack */
+               tracker_db_manager_init (0, &is_first_time_index, TRUE);
+
+               tracker_db_index_manager_init (0,
+                                              tracker_config_get_min_bucket_count (config),
+                                              tracker_config_get_max_bucket_count (config));
+               
+               file_index = tracker_db_index_manager_get_index (TRACKER_DB_INDEX_FILE);
+               email_index = tracker_db_index_manager_get_index (TRACKER_DB_INDEX_EMAIL);
+               
+               tracker_data_manager_init (config, language, file_index, email_index);
+               
+               tracker_backup_save (turtle_file);
+
+               /* Shutdown the DB stack */
+               tracker_data_manager_shutdown ();
+               
+               tracker_db_index_manager_shutdown ();
+               tracker_db_manager_shutdown ();
+
+               g_free (turtle_file);
+

Under supervision of Ivan.

 I wrote a mail explaining the why of these changes. And really, the
functionality of all the backup stuff is here. It starts the db start
(for reading), read the metadata, save it, and close completely the
stack. Then the usual reindex process continues.



42. I don't think you want to do this ONLY on FORCE_REINDEX. Since that 
is rarely the case. Although I admit I am not entirely sure what 
org_freedesktop_Tracker_Indexer_restore_backup() is supposed to do from 
the API name. Does it write the Turtle file to the DB? If so, then why 
would you want to write metadata to the database AFTER it has just been 
written to with all the latest information from the indexer? If your 
intention (on the other hand) is to write initial data to the database 
on a reindex before the indexer has had a chance to index the real data 
it gets, then I think this is going about it the wrong way.

+       if (flags & TRACKER_DB_MANAGER_FORCE_REINDEX) {
+               g_debug ("Setting callback for crawling finish detection");
+               callback_id = g_signal_connect (private->processor, "finished",
+                                               G_CALLBACK (crawling_finished_cb),
+                                               &callback_id);
+       }
+

Under supervision of Ivan.

Same considerations. 

I am thinking that this mechanism is broken if we call the "restart with
reindex" from the preferences window, because in that case tracker
remove the DB files and restart.

Ivan







[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]