Re: [Tracker] Re-index/re-scan on each restart?
- From: Michael Steiner <msteiner watson ibm com>
- To: martyn lanedo com
- Cc: michisteiner verizon net, tracker-list gnome org
- Subject: Re: [Tracker] Re-index/re-scan on each restart?
- Date: Wed, 01 Sep 2010 15:29:16 -0400 (EDT)
Hi Martyn,
On Wed, 01 Sep 2010 19:58:52 +0100, Martyn Russell <martyn lanedo com> said:
Thanks for a quick and detailed reply!
>> However, there is one thing which is a bit annoying: Tracker seems to
>> re-scan/re-index all my files each time i start tracker (e.g., after
>> reboot or re-login), even if in the previous run it seemed to have
>> finished the complete scan/index (i.e., tracker-status showed
>> everything as idle). As i'm indexing a fair bunch of files this takes
>> several hours (almost days) even with most aggressive scan settings.
>>
>> Is this a feature or a bug?
MR>
MR> Carlos recently fixed a bug which sounds similar to what you're
MR> describing here, see commit:
MR>
MR> 9339e32afca110fa08ac89a7161c080a9c70636e
MR>
MR> This is in master, but not in 0.8.
I must admit i haven't yet tried master/0.9, as even with a brand-new
lucid i couldn't get all build dependencies pre-built/packaged. I
guess i should take the plunge and hand-install them ...
MR> I will cherry-pick this for
MR> tomorrow's release. The difference in start up time is incredible, for
MR> my 20k files on this desktop machine it takes ~35s, before it was
MR> taking minutes IIRC (that time is just to check+add monitors).
Speed-up sounds good :-)
>> If the former, it is probably in the attempt to not lose any file
>> modification when tracker is not run? Of course, the best approach
>> for this would be to have an indexer which runs independently of the
>> desktop.
MR>
MR> Not sure what you mean by that?
Sorry for being too brief (even though my overall mail seemed to be
rather on the lengthy side :-)
What i meant is the following: Given that tracker runs only when the
desktop is up, there are potentially file modifications which have
happened between runs. So a potential reason for a rescan after a
restart could be to verify whether such changes have happened or not.
As said below, i might trade some inaccuracy for a (hopefully
greatly) reduced startup cost, though. However, from what you write
below i guess you unavoidably have to go through all files anyway to
activate the inotify interface, so some form of scan always happens
at some cost (but not necessarily a re-index if the date of
modification hasn't changed recently)?
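
The scan-without-re-index idea in the paragraph above can be sketched
roughly as follows. This is only an illustration in Python, not how
tracker-miner-fs is actually implemented: `files_needing_reindex` and
the `known_mtimes` dict are hypothetical names, the latter standing in
for whatever the real index stores per file.

```python
import os


def files_needing_reindex(root, known_mtimes):
    """Walk `root` and return the paths whose mtime differs from the stored one.

    `known_mtimes` is a hypothetical {path: mtime} mapping saved by a
    previous run. Every file is still visited (the crawl cost remains),
    but only new or modified files are reported for re-indexing.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                # File vanished (or became unreadable) between listing and stat.
                continue
            if known_mtimes.get(path) != mtime:
                changed.append(path)
    return changed
```

The walk itself is still proportional to the number of files, which
matches the point that some form of scan always happens; what can be
saved is the much more expensive extraction step for unchanged files.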
>> Short of that, it would be great to have an option which
>> allows turning off that feature (i definitely would trade the rather
>> rare missed modifications against not having a CPU and IO hog after
>> each UI login)
MR>
MR> This is possible in 0.9, there are config options,
MR>
MR> EnableMonitors=false (in 0.8 but will still crawl)
MR> CrawlingInterval=0 (in 0.9, set to -1 to disable crawling entirely)
I guess the second option would give me the
monitor-but-don't-(re)crawl i was looking for? I guess that's a reason
to manually upgrade the build dependencies :-)
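
For reference, a minimal sketch of flipping such an option from a
script. Only the key names (`EnableMonitors`, `CrawlingInterval`) and
the $HOME/.config/tracker location come from this thread; the helper
name `set_cfg_option`, the group name "Indexing" and the file name in
the usage line are assumptions, so check the actual .cfg file first.

```python
import configparser


def set_cfg_option(path, group, key, value):
    """Set `key=value` under `[group]` in an ini-style .cfg file.

    Assumes the file is plain key=value text grouped under [Section]
    headers. Note that comments in the file are not preserved on rewrite.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the CamelCase key names as written
    parser.read(path)
    if not parser.has_section(group):
        parser.add_section(group)
    parser.set(group, key, str(value))
    with open(path, "w") as fh:
        parser.write(fh, space_around_delimiters=False)
```

Usage would be along the lines of
`set_cfg_option(cfg_path, "Indexing", "CrawlingInterval", -1)`, with
`cfg_path` pointing at the miner's .cfg file (hypothetically
tracker-miner-fs.cfg), done while tracker is not running.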
MR> The latter option above allows application specific indexing only so
MR> the crawler doesn't burn any CPU time, however, it isn't the default
MR> or recommended since you then rely on applications to keep data up to
MR> date.
MR>
>> If it's a bug, following some observations after looking at the
>> log-files in ~/.local/share/tracker:
>>
>> - tracker-store.log is empty
MR>
MR> All logs will be if Verbosity is < 1 in their respective .cfg files in
MR> $HOME/.config/tracker.
All cfgs had verbosity=0. In this case i wasn't concerned about having
empty logs, but just as side-info that tracker-store didn't see any
errors.
>> - tracker-miner-fs.log has by far the most messages (several
>> hundreds), half of them are of the flavor of below
>>
>> 01 Sep 2010, 08:28:13: Tracker-Critical **: Could not execute sparql:
>> Unable to insert multiple values for subject
>> `urn:uuid:0c147350-e9fe-9b16-ced3-2564b21ef9fa' and single valued
>> property `dc:rights' (old_value:
>> 'http://creativecommons.org/licenses/by/2.5/', new value:
>> 'http://www.apache.org/licenses/LICENSE-2.0')
MR>
MR> Those should be fixed.
You mean fixed in 0.8.16? (As mentioned, i'm not running 0.9 yet)
MR> Could you turn the verbosity up to 3 and create
MR> a new bug report with the file that causes this? (if possible)
Ok, i'll change the config and hope my log-files don't overwhelm my
disk :-) [as mentioned, i index a lot of files with the index
currently about 4GB ...]
>> PS: when i installed it, i also ran ``make check'' and after i
>> figured out that i had to do a ``cd `/bin/pwd`'' to please some tests
>> it all worked fine with the exception of the
>> ``tracker-password-provider-test'' test which didn't run as it
>> expected some pwd files pre-configured which i didn't have (and
>> couldn't immediately figure out how to create)
MR>
MR> For 0.8? or 0.9? This should be fixed I would say.
This was for 0.8.16. Haven't run 0.9.* yet for above mentioned
reasons.
-michael-