[Tracker] Re-index/re-scan on each restart?



Hi,

After switching from an old RHEL to Ubuntu Lucid, i finally dumped
Google Desktop and went for tracker (0.8.16).  So far i'm quite happy
with what it does, and given the architecture and open APIs i'm
confident that it will get even better.

However, there is one thing which is a bit annoying: Tracker seems to
re-scan/re-index all my files each time i start it (e.g., after a
reboot or re-login), even if the previous run seemed to have finished
the complete scan/index (i.e., tracker-status showed everything as
idle).  As i'm indexing a fair number of files, this takes several
hours (almost days) even with the most aggressive scan settings.

Is this a feature or a bug?

If the former, it is presumably an attempt not to lose any file
modifications made while tracker is not running? Of course, the best
approach for this would be to have an indexer which runs independently
of the desktop. Short of that, it would be great to have an option to
turn that feature off (i would definitely trade the rather rare missed
modifications for not having a CPU and IO hog after each UI login).

If it's a bug, here are some observations from looking at the
log files in ~/.local/share/tracker:

- tracker-store.log is empty

- tracker-extract.log contains a warning

      01 Sep 2010, 09:53:06: Tracker-Warning **: Could not load module 'libextract-mplayer.so': /usr/lib/tracker-0.8/extract-modules/libextract-mplayer.so: undefined symbol: tracker_extract_guess_date

   which seems to be due to libextract-mplayer (and libextract-totem)
   using a non-existent function ``tracker_extract_guess_date''
   (rather than, presumably, the existing ``tracker_date_guess'') and
   is unlikely to have an impact here.

   I also see a few warnings along the lines of

       01 Sep 2010, 09:58:05: Tracker-Warning **: Couldn't convert 14848 bytes from CP1252 to UTF-8: Invalid byte sequence in conversion input

   but this is probably also not relevant for this problem?

- tracker-miner-fs.log has by far the most messages (several
  hundred); half of them look like the following:

    01 Sep 2010, 08:28:13: Tracker-Critical **: Could not execute sparql: Unable to insert multiple values for subject `urn:uuid:0c147350-e9fe-9b16-ced3-2564b21ef9fa' and single valued property `dc:rights' (old_value: 'http://creativecommons.org/licenses/by/2.5/', new value: 'http://www.apache.org/licenses/LICENSE-2.0')


   (three quarters of them include
   http://www.apache.org/licenses/LICENSE-2.0; for the rest i didn't
   spot a pattern)


   As these are marked critical, maybe they are what causes the re-index?
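
To check whether the remaining messages follow a pattern, the critical
errors could be tallied by property and new value.  Below is a minimal
Python sketch.  It assumes every such message has the same format as
the one quoted above; the regular expression and the name
tally_conflicts are illustrative only and not part of tracker.

```python
import re
from collections import Counter

# Pattern modelled on the single message quoted above (an assumption:
# other messages in the log may be formatted differently).
PATTERN = re.compile(
    r"single valued property `(?P<prop>[^']+)'\s*"
    r"\(old_value:\s*'(?P<old>[^']*)',\s*"
    r"new value:\s*'(?P<new>[^']*)'\)"
)

# The message quoted above, wrapped the way a mail client might wrap it.
SAMPLE = """01 Sep 2010, 08:28:13: Tracker-Critical **: Could not execute sparql: Unable to insert multiple values
for subject `urn:uuid:0c147350-e9fe-9b16-ced3-2564b21ef9fa' and single valued property `dc:rights'
(old_value: 'http://creativecommons.org/licenses/by/2.5/', new value:
'http://www.apache.org/licenses/LICENSE-2.0')
"""


def tally_conflicts(log_text):
    """Count (property, new value) pairs in 'multiple values' errors."""
    # Collapse all whitespace so the pattern also matches wrapped messages.
    flat = " ".join(log_text.split())
    return Counter(
        (m.group("prop"), m.group("new")) for m in PATTERN.finditer(flat)
    )


# Print the most frequent (property, new value) pairs first.
for (prop, new_value), count in tally_conflicts(SAMPLE).most_common():
    print(count, prop, new_value)
```

To run it on the real log, pass the contents of
~/.local/share/tracker/tracker-miner-fs.log to tally_conflicts()
instead of SAMPLE.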


Any insights are welcome. Thanks!

-michael-


PS: when i installed it, i also ran ``make check'', and after i
figured out that i had to do a ``cd `/bin/pwd`'' to please some tests,
it all worked fine with the exception of the
``tracker-password-provider-test'' test, which didn't run because it
expected some pre-configured pwd files which i didn't have (and
couldn't immediately figure out how to create).


