Re: [Tracker] Releasing Tracker 0.6.90



Jamie McCracken wrote:
On Fri, 2009-01-30 at 09:46 +0000, Martyn Russell wrote:
Jamie McCracken wrote:
But just when I was happy about one issue being fixed, I noticed that
performance on removal is awful again:
Unpacking linux-2.6.28.tar.bz2 in $HOME takes tracker (r2862) around
20 minutes to index.
When I rm -rf linux-2.6.28/, it takes over an hour, with my cpu
constantly busy (i.e. max speed at 100%) :-/
Yeah, that's something that needs investigation prior to release - thanks
for spotting it.
Carlos is looking into this right now.

As it stands, we will be releasing next week unless there are other issues noticed before then that need addressing.


In addition to the deletion of folders:

Other issues: (* means must fix prior to release)

1*) Directory crawling is too CPU intensive and eats too much memory
compared to 0.6.6. The old version did not queue up thousands of files but
instead queued up directories where the mtime indicated they needed updating.
So memory-wise, only the paths of all directories (and subdirectories) were
held in memory.

How much memory is too much? I would say that 30 MB or so is actually not bad. I agree that 3 MB is much better, but if it is not doable with the current design, then we shouldn't block on this.

The design here is *completely* different. You are comparing apples to pears here. Yes, the old version queued directories and those directories were then crawled, but that is not how it is currently designed. Right now, we send all files and directories to the indexer, and the indexer does the filtering based on the mtime information in the database. Since you have to crawl all directories anyway, the only difference here is that the queues are bigger.
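
To make the difference in queue size concrete, here is a minimal sketch of the two policies as described above. It is not Tracker code: the CrawlEntry type, the function names, and the "mtime differs" comparison are all assumptions made for illustration.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical record: one crawled path plus the mtime the database holds. */
typedef struct {
        const char *path;
        int         is_directory;
        time_t      disk_mtime;  /* mtime seen while crawling */
        time_t      db_mtime;    /* mtime stored at the last index run */
} CrawlEntry;

/* Old policy as described: queue only directories whose mtime changed.
 * Treating "changed" as "differs from the database" is an assumption. */
static size_t
queue_length_directories_only (const CrawlEntry *entries, size_t n)
{
        size_t i, queued = 0;

        for (i = 0; i < n; i++) {
                if (entries[i].is_directory &&
                    entries[i].disk_mtime != entries[i].db_mtime) {
                        queued++;
                }
        }

        return queued;
}

/* Current policy as described: every file and directory is queued. */
static size_t
queue_length_everything (const CrawlEntry *entries, size_t n)
{
        (void) entries;

        return n;
}

/* Current policy: the filtering happens later, on the indexer side. */
static int
indexer_should_process (const CrawlEntry *entry)
{
        return entry->disk_mtime != entry->db_mtime;
}
```

Both policies end up doing work for the same changed entries. What differs is how many entries sit in the queue before the mtime check is applied.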

This is something we plan to improve on in the near future, by moving the crawling to the indexer. This is not going to get done before this release - unless you want to wait another 6+ months before we release. This change of design would significantly reduce the memory footprint here. Right now, changing this is not an option.

As for the CPU use, this is something that is very difficult to gauge because on my two machines here (one being a laptop) I don't even notice Tracker is running. The way I see it:

1. I could be getting more speed out of *MY* tracker.
2. It might be running slowly on other people's machines.

This is what the throttle is here for. It is set to 0 by default. This is what we use:

        /* Get the throttle, add 5 (minimum value) so we don't do
         * nothing and then multiply it by the factor given
         */
        throttle  = tracker_config_get_throttle (config);
        throttle += 5;
        throttle *= multiplier;

        if (throttle > 0) {
                g_usleep (throttle);
        }

Bear in mind that the += 5 part was not in the last release, which means we generally throttle slightly longer than we did previously.
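
As a rough illustration of what the += 5 changes, the arithmetic can be pulled out into two small functions. The function names and the multiplier value used in the note below are assumptions for the example; only the add-then-multiply order comes from the snippet above.

```c
/* Previous release as described: no minimum added before multiplying.
 * config_throttle stands in for tracker_config_get_throttle (config). */
static unsigned long
throttle_usec_previous (unsigned long config_throttle,
                        unsigned long multiplier)
{
        return config_throttle * multiplier;
}

/* Current code as quoted: add 5 (the minimum), then multiply. */
static unsigned long
throttle_usec_current (unsigned long config_throttle,
                       unsigned long multiplier)
{
        return (config_throttle + 5) * multiplier;
}
```

With the default throttle of 0 and an assumed multiplier of 100, the previous calculation gives 0 (so no sleep at all), while the current one gives 500, which is the value passed to g_usleep () in microseconds.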

We should really be finding out *why* this happens by:

1. Profiling the software on the machine seeing the slowdown to find out where the bottleneck is.

2. Perhaps adding some more throttling.

2*) Moving a file into another directory caused the file to no longer be
searchable. Also when renaming a directory a search on the new name only
finds the changed directory name but none of its files or subfolders.
Tracker should always return a hit if part of the path of a file
matches! (as per 0.6.6)

Is the other directory the file is moved to monitored?

OK, we will look into the rename issue.

3*) TST still shows the email category twice - I'm not sure of the cause but I am
investigating this one

OK, great thanks.

4*) Disk IO is still quite heavy - searching in TST can time out D-Bus wise
during indexing. It tries to pause the indexer but takes several minutes
- I don't suppose there is much we can do here? If not, I suggest upping the
default throttle levels to minimize the occurrence of this

Hmm, it should not be taking minutes. It should be doing the following in the indexer:

- COMMIT (on any current transactions)
- Pause indexing

If this is not happening, then perhaps we are spending too long on IO with the database? This should be investigated.

5) Applet takes a long time to update its status if started after
trackerd.

Yes, it is much better than it was. This can be improved on, I suppose.

--
Regards,
Martyn


