Re: [Tracker] Re-index/re-scan on each restart?



On 01/09/10 20:29, Michael Steiner wrote:
Hi Martyn,

On Wed, 01 Sep 2010 19:58:52 +0100, Martyn Russell <martyn lanedo com> said:

Thanks for a quick and detailed reply!

No problem,

    MR>  This is in master, but not in 0.8.

I must admit i haven't tried yet master/0.9 as even with a brand-new
lucid i couldn't get all build dependencies pre-built/packaged.  I
guess i should plunge and hand-install them ...

It's not really necessary; 0.8.x should be good enough unless you want to use the latest and greatest :)

    MR>  Not sure what you mean by that?

Sorry for being too brief (even though my overall mail seemed to be
rather on the lengthy side :-)

What i meant is the following: Given that tracker runs only when the
desktop is up, there are potentially file modifications which have
happened between runs.  So a potential reason for a rescan after a
restart could be to verify whether such changes have happened or not.
As said below, i might trade, though, some inaccuracies for (a
hopefully greatly) reduced startup cost.  However, from what you write
below i guess you also unavoidably have to go through all files to
activate the inotify interface so some form of scan always happens
with some cost (but not necessarily a re-index if the date of
modification hasn't changed recently) ?

I see. We do 2 things here. We store the file and its mtime in the database. We also use inotify to monitor changes during the day to day running of the computer. When the miner-fs starts, it sets up these monitors and checks mtimes against the database to make sure nothing changed. If the mtime is different, we reindex a file or directory. The inotify part is possibly one of the weakest parts of Tracker right now, as it doesn't scale with large numbers of directories and uses a lot of resources to work with any success generally. With FANotify coming up in newer kernels, we are looking to improve this situation in the near future.
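
To make the startup check concrete, here is a rough sketch in Python (not Tracker's actual code, which is C and uses its own database). The function names and the plain dict standing in for the database are assumptions for illustration only.

```python
import os

def crawl_mtimes(root):
    """Walk root and return {path: mtime} for every directory and file."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        mtimes[dirpath] = os.lstat(dirpath).st_mtime
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so a dangling symlink does not abort the crawl
            mtimes[path] = os.lstat(path).st_mtime
    return mtimes

def paths_to_reindex(stored, current):
    """Paths that are new, or whose mtime differs from the stored one."""
    return sorted(p for p, m in current.items() if stored.get(p) != m)
```

Here `stored` plays the role of what was saved on the previous run and `current` is what the crawl sees at startup; only the paths returned by `paths_to_reindex` would need re-indexing, but every path still has to be visited once.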

    MR>    EnableMonitors=false (in 0.8 but will still crawl)
    MR>    CrawlingInterval=0 (in 0.9, set to -1 to disable crawling entirely)

I guess the second argument would give me the
monitor-but-don't-(re)crawl i was looking for?  I guess reasons to
manually upgrade the build dependencies :-)

Actually, without crawling you can't have monitors. Perhaps that should be documented more clearly, but if we don't crawl, we can't set up monitors (you have to crawl to find the directories to set monitors on).
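
The reason is that an inotify watch covers a single directory and is not recursive, so every directory has to be found first. A minimal sketch in Python, where `add_watch` is a hypothetical callback standing in for whatever actually registers the watch:

```python
import os

def setup_monitors(root, add_watch):
    """Crawl root and call add_watch(directory) once per directory found.

    Returns the number of directories visited, which is also the number
    of watches a non-recursive monitor would need.
    """
    count = 0
    for dirpath, _dirnames, _filenames in os.walk(root):
        add_watch(dirpath)
        count += 1
    return count
```

For example, `setup_monitors(os.path.expanduser("~"), print)` would just list the directories that would each need a watch, which also gives a feel for why large trees are expensive to monitor.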

    MR>  The latter option above allows application specific indexing only so
    MR>  the crawler doesn't burn any CPU time, however, it isn't the default
    MR>  or recommended since you then rely on applications to keep data up to
    MR>  date.
    MR>
    >>  If it's a bug, following some observations after looking at the
    >>  log-files in ~/.local/share/tracker:
    >>
    >>  - tracker-store.log is empty
    MR>
    MR>  All logs will be if Verbosity is < 1 in their respective .cfg files in
    MR>  $HOME/.config/tracker.

All cfgs had verbosity=0. In this case i wasn't concerned about having
empty logs, but just as side-info that tracker-store didn't see any
errors.

This is the default; logging can add a performance hit, especially on lower-end machines. Most people don't need to worry about it.

    >>  - tracker-miner-fs.log has by far the most messages (several
    >>  hundreds), half of them are of the flavor of below
    >>
    >>  01 Sep 2010, 08:28:13: Tracker-Critical **: Could not execute sparql:
    >>  Unable to insert multiple values for subject
    >>  `urn:uuid:0c147350-e9fe-9b16-ced3-2564b21ef9fa' and single valued
    >>  property `dc:rights' (old_value:
    >>  'http://creativecommons.org/licenses/by/2.5/', new value:
    >>  'http://www.apache.org/licenses/LICENSE-2.0')
    MR>
    MR>  Those should be fixed.

You mean fixed in 0.8.16? (As mentioned, i'm not running 0.9 yet)

I mean it looks like a bug we should fix.
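
For anyone wondering what the error itself means: the store refuses a second, different value for a property that is declared single-valued. A toy model in Python (purely illustrative, not how tracker-store is implemented; the class and exception names are made up):

```python
class SingleValuedError(Exception):
    """Raised when a second value is inserted for a single-valued property."""

class ToyStore:
    def __init__(self, single_valued):
        self.single_valued = set(single_valued)
        self.values = {}  # (subject, property) -> list of values

    def insert(self, subject, prop, value):
        existing = self.values.setdefault((subject, prop), [])
        if value in existing:
            return  # re-inserting the same value is harmless
        if prop in self.single_valued and existing:
            raise SingleValuedError(
                "old_value: %r, new value: %r" % (existing[0], value))
        existing.append(value)
```

In this model, a file whose metadata yields two different `dc:rights` values would trigger the error on the second insert, which matches the shape of the log message quoted above.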

    MR>  Could you turn the verbosity up to 3 and create
    MR>  a new bug report with the file that causes this? (if possible)

Ok, i'll change the config and hope my log-files don't overwhelm my
disk :-) [as mentioned, i index a lot of files with the index
currently about 4GB ...]

The logs shouldn't be so bad.
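
If you would rather script the change than edit each .cfg by hand, something along these lines should work. This is a sketch that operates on the file's text; the `[General]` section name is an assumption, so check it against your own files in $HOME/.config/tracker before relying on it.

```python
import configparser
import io

def set_verbosity(cfg_text, level, section="General"):
    """Return cfg_text with Verbosity set to level in the given section.

    The section name is an assumption; pass the one your .cfg file uses.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(cfg_text)
    parser[section]["Verbosity"] = str(level)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

Read the file, pass its contents through `set_verbosity(text, 3)`, and write the result back; setting it to 0 again afterwards avoids the logging overhead.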

    >>  PS: when i installed it, i also ran ``make check'' and after i
    >>  figured out that i had to do a ``cd `/bin/pwd`'' to please some tests
    >>  it all worked fine with the exception of the
    >>  ``tracker-password-provider-test'' test which didn't run as it
    >>  expected some pwd files pre-configured which i didn't have (and
    >>  couldn't immediately figure out how to create)
    MR>
    MR>  For 0.8? or 0.9? This should be fixed I would say.

This was for 0.8.16.  Haven't run 0.9.* yet for above mentioned
reasons.

OK, I just thought I would ask as you say you use what comes with Ubuntu but go on to say you built Tracker - so I wondered if you actually tried 0.9.

--
Regards,
Martyn


