Re: [Tracker] Refactoring the filesystem miner



On 05/10/11 11:49, Carlos Garnacho wrote:
Hey hey,

Hi Carlos,

Just to tell in the ML, this is now implemented in the miner-fs-refactor
branch, timings for mtime checks have improved quite a lot there,
whereas real indexing performance stays roughly the same, as expected.

   Intel i5, SSD disk.
   indexing 468 folders, 6609 files

                  master     miner-fs-refactor
   First index    16.65s     15.4s
   Mtime checks   2.36s      0.99s


   Amd athlon, 5400rpm disk
   indexing 625 folders, 10959 files

                  master     miner-fs-refactor
   First index    178.2s     176.65s
   Mtime checks   12.39s     5.35s

Thanks for the figures here.

I have reviewed the branch and here are my comments:

1. Is there something to fix here ?
+               /* FIXME: Should avoid recursing through children,
+                * but we're not allowed to modify the tree during
+                * traversal, nor have a way to skip recursing within
+                */


2. I would add some warning here:
+               /* what are we doing with such file? should happen rarely,
+                * only with files that we've queried, but we decided not
+                * to crawl (i.e. embedded root directories, that would
+                * be processed when that root is being crawled).
+                */


3. We should put all typedefs at the top of the file (or should we) ?
+typedef struct {
+       TrackerFileNotifier *notifier;
+       GNode *cur_parent_node;
+
+       /* Canonical copy from priv->file_system */
+       GFile *cur_parent;
+} DirectoryCrawledData;


4. We are using a lot of sync calls


5. Another fixme:
+               /* FIXME: This supposedly works, but in practice
+                * won't ever happen as the parent directory
+                * wasn't being monitored if being ignored
+                */


6. Another fixme:
+                                       /* Move only the folder, and
+                                        * delete all its contents
+                                        */
+                                       /* FIXME: it doesn't */

7. The progress isn't meant to do this (i.e. go to 2%), looks like a regression: 21 Nov 2011, 12:04:43: 85% File System - Processingâ 03s remaining 21 Nov 2011, 12:04:44: 96% File System - Processingâ unknown time left 21 Nov 2011, 12:04:45: 2% File System - Processingâ unknown time left
21 Nov 2011, 12:04:45:  â     File System             - Idle


8. I think some of the miner-fs logging needs changing, there are some things which are debugging but printed as messages and visa versa:

e.g. these should be debug logs:
Tracker-Message: Flushing SPARQL buffer, reason: Queue handlers WAIT
Tracker-Message: (Sparql buffer) Finished array-update with 50 tasks

e.g. these should be messages (i.e. when they're officially in the db, I would skip the "processing" versions here but at least show the "create"/"update"/"delete" messages): (tracker-miner-fs:6782): Tracker-DEBUG: Creating new item 'file:///home/martyn/Videos/oceans13.m4v' (tracker-miner-fs:6782): Tracker-DEBUG: Creating new item 'file:///home/martyn/Videos/Inception.m4v'


9. I see a critical here when re-running miner-fs after initial index:
(tracker-miner-fs:6854): GLib-CRITICAL **: g_timer_reset: assertion `timer != NULL' failed

I have avoided _reset() in the past because I had issues with it and so use _destroy() instead. I think this was the reason why. We should really fix it in glib if this is still the case here.


10. I notice we get events for .xsession-errors now, before we *EXPLICITLY* had code to ignore that file because it just fills logs. We should fix that up: (tracker-miner-fs:6854): Tracker-DEBUG: Received monitor event:0 (G_FILE_MONITOR_EVENT_CHANGED) for file:'file:///home/martyn/.xsession-errors'


11. Currently make distcheck fails:
In file included from ../../../src/libtracker-miner/tracker-monitor.h:29:0,
                 from ../../../src/libtracker-miner/tracker-monitor.c:26:
../../../src/libtracker-miner/tracker-indexing-tree.h:30:33: fatal error: tracker-miner-enums.h: No such file or directory


12. I think we need more tests for:
â src/libtracker-miner/tracker-indexing-tree.c
â src/libtracker-miner/tracker-file-notifier.c


13. How come the branch adds:
â tests/libtracker-client/tracker-test

--

Once these are addressed I think we can merge this to master.

Thanks Carlos,

--
Regards,
Martyn

Founder and CEO of Lanedo GmbH.



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]