Re: [Tracker] How does tracker-miner-fs work?



Hi Lado,

On Sat, Mar 19, 2016 at 1:12 PM, Lado Kumsiashvili <herrlado gmail com> wrote:
Hello,

I'm not sure if this is a bug or a feature or it depends on my setup.

I have a folder (/home/user/development) with about 61G of development
files, several projects, libraries, Eclipse, IntelliJ etc.

First question: do you intend Tracker to index that folder? Are its
search features useful to you for code? It seems that you actively
added this folder to be indexed by Tracker, so I'm mostly assuming
yes.



The problem is, after each boot tracker-miner-fs is at 100% and is
mining and mining....

I don't think it's mining. Tracker has a bootstrap process where it
has to recursively descend through folders, in order to:

1) Set up directory monitors
2) Check whether previously indexed contents in the folder were
updated (checking file mtimes)
3) Check the mtime of the folder itself, in order to detect deleted
content, and schedule no longer existing files for removal

Out of those operations, #1 is more expensive than you'd expect,
although obviously altogether this means the bootstrap process depends
on the amount of contents to be indexed. That's just the status quo in
userspace monitoring...
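
To make the three bootstrap steps more concrete, here is a rough Python
sketch. It is not Tracker's actual code: bootstrap_scan and the
"indexed" dictionary (path -> mtime remembered from the last run) are
hypothetical names, and step 3 is simplified to a plain comparison of
known paths against what is found on disk, instead of looking at the
folder mtime first.

```python
import os

def bootstrap_scan(root, indexed):
    """Walk root once and compare it with a previously stored index.

    indexed maps file path -> mtime recorded at the last indexing run
    (a hypothetical stand-in for whatever the store remembers).
    Returns (dirs_to_monitor, changed_files, deleted_paths).
    """
    dirs_to_monitor = []
    changed = []
    seen = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Step 1: every directory needs its own monitor.
        dirs_to_monitor.append(dirpath)
        seen.add(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            seen.add(path)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable, skip it
            # Step 2: new file, or mtime differs from the stored one.
            if indexed.get(path) != mtime:
                changed.append(path)
    # Step 3 (simplified): anything indexed but no longer on disk.
    deleted = sorted(p for p in indexed if p not in seen)
    return dirs_to_monitor, changed, deleted
```

Even in this toy form, every file gets a stat() call and every
directory gets listed, so the cost grows with the size of the tree
whether or not anything changed.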

And #2/#3 are not as superfluous as you might think. Usually contents
don't change between boots, but directory monitors are a finite
resource, so it's the only safety net to have contents reindexed
properly in directories that could not be monitored in previous boots.
Given the size of your development folder, I'd say it's safe to assume
that directory monitor exhaustion may well be happening in your
case.
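
If you want a rough idea of whether monitors could run out, the sketch
below counts the directories under a folder and reads the per-user
inotify watch limit. It assumes the monitors end up as inotify watches
on Linux (I did not state that above, so treat it as an assumption),
and it ignores watches taken by other applications as well as any
limit Tracker applies on its own. The function names are made up for
the example.

```python
import os

# Standard Linux location of the per-user inotify watch limit.
INOTIFY_LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"

def count_directories(root):
    """Count root plus every directory below it (one monitor each)."""
    return sum(1 for _ in os.walk(root))

def read_watch_limit(path=INOTIFY_LIMIT_FILE):
    """Return the inotify watch limit, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    root = os.path.expanduser("~/development")  # folder from this thread
    print("directories:", count_directories(root))
    print("inotify watch limit:", read_watch_limit())
```

If the directory count gets anywhere near the limit, some directories
will not be monitored, and the mtime checks at startup are the only
way changes in them get noticed.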

I think the explanation in
https://bugzilla.gnome.org/show_bug.cgi?id=762194#c1 also applies here
somehow...

Basically, in order to cut down on io/cpu usage during startup, you
can only do the following:
- Reduce the amount of contents to be indexed.
- Reduce the frequency of this operation. There is a
/org/freedesktop/tracker/miner/fs/crawling-interval setting that is
intended to help here (not exposed in tracker-preferences, can be
changed with e.g. dconf-editor), although I notice it's currently not
being honored, something that'll have to be fixed after 1.8.0...
Once fixed, and combined with disabling the enable-monitors setting,
it should lead to 0% io/cpu usage in startup until the crawling
interval (in days) is met, of course at the expense of outdated
indexed content.
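
The intended behaviour of the crawling interval boils down to a check
like the one below. This is only an illustration for a positive
interval in days; crawl_is_due and its arguments are hypothetical, and
any special values the setting may accept are not modelled.

```python
from datetime import datetime, timedelta

def crawl_is_due(last_crawl, interval_days, now=None):
    """True once at least interval_days have passed since last_crawl."""
    if now is None:
        now = datetime.now()
    return now - last_crawl >= timedelta(days=interval_days)
```

With an interval of 7 and a last crawl two days ago, startup would
skip the recursive descent entirely.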


To test what goes on, I have killed tracker-miner-fs and started it
with the -v 3 option. I left the notebook on overnight. In the morning
tracker-miner-fs was idle, as it had managed to work through the
complete folder. I killed the process and started it again with -v 3,
and it began again to index the entire folder from the beginning. But
why, when nothing has changed there? Another point is the settings option
"Index content in the background: Only when computer is not being used"
But it does not work as expected, at least I do not expect to see any
tracker process (and not at all at 100%), when I work with my IDE or do
some other things. So the question, what does this option mean? Why
does it not work as expected?

This boils down to the sched_setscheduler(2) syscall, with SCHED_IDLE
policy. I'd be surprised if the kernel didn't give up those cycles if
that's really necessary.
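
For reference, this is roughly what that request looks like, using
Python's wrapper around the same syscall. It is a sketch and not the
code Tracker runs; request_idle_scheduling is a made-up name, and the
call is Linux-only.

```python
import os

def request_idle_scheduling(pid=0):
    """Ask the kernel to run pid (0 = this process) under SCHED_IDLE.

    Returns True on success, False if the policy is unavailable on
    this platform or the kernel refuses the request.
    """
    if not hasattr(os, "SCHED_IDLE"):
        return False
    try:
        # SCHED_IDLE requires a static priority of 0.
        os.sched_setscheduler(pid, os.SCHED_IDLE, os.sched_param(0))
    except OSError:
        return False
    return True
```

Note that SCHED_IDLE only affects CPU scheduling: a process under this
policy can still show 100% in top when nothing else wants the CPU, and
it is supposed to be pushed aside as soon as something else does.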



Right now

$ tracker daemon
Store:
19 Mär 2016, 13:07:49:  ✓     Store                   - Idle

Miners:
19 Mär 2016, 13:07:49:    1%  File System    (PAUSED) - Crawling
recursively directory 'file:///home/lado/development'
19 Mär 2016, 13:07:49:  ✓     Applications            - Idle
19 Mär 2016, 13:07:49:  ✗     RSS/ATOM Feeds          - Not running or
is a disabled plugin
19 Mär 2016, 13:07:49:  ✓     Userguides              - Idle
19 Mär 2016, 13:07:49:  ✓     Extractor               - Idle
19 Mär 2016, 13:07:49:  ✗     Media                   - Not running or
is a disabled plugin

but the top shows 100% of CPU for tracker-miner-fs




I have killed right now all the tracker processes and started them with

tracker daemon -s


tracker daemon -f shows

19 Mär 2016, 13:09:55:    1%  File System             - Crawling single
directory 'file:///home/lado'
19 Mär 2016, 13:09:57:    1%  File System             - Crawling single
directory 'file:///home/lado/Desktop'
19 Mär 2016, 13:09:57:    1%  File System             - Crawling
recursively directory 'file:///home/lado/development'


and tracker-miner-fs is at 100%, crawling through the huge number of
files in the development folder.



My home folder is mounted with
/dev/sda5 on /home type ext4 (rw,noatime,nodiratime,data=ordered)


My version is 1.7.4 with GNOME 3.18 under Gentoo

Did you modify the ignore-directories-with-content setting previously?
In 1.7.4 I added ".git" there so it'd ignore git repos by default, but
of course dconf will respect any previous user change. It should
seriously trim down the directories tracker-miner-fs crawls into for
your case, but might not be what you want.
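
The effect of that setting on crawling can be pictured like this. It
is a simplified sketch, not the miner's implementation:
crawl_directories is a hypothetical name, and I'm assuming here that a
marker matches whether it is a file or a folder inside the directory.

```python
import os

def crawl_directories(root, ignore_if_contains=(".git",)):
    """Yield directories under root, skipping any directory that
    directly contains one of the marker names, plus everything
    below it."""
    markers = set(ignore_if_contains)
    for dirpath, dirnames, filenames in os.walk(root):
        if markers.intersection(dirnames) or markers.intersection(filenames):
            dirnames[:] = []  # prune: do not descend any further
            continue
        yield dirpath
```

With ".git" as a marker, a checkout like development/project is
dropped as a whole, including its source subfolders, which is why it
might not be what you want if you do search your code.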

Cheers,
  Carlos

