Re: [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance

From: Michael Biebl <mbiebl gmail com>
To: "Chen, Zhenqiang" <zhenqiang chen intel com>
Cc: Carlos Garnacho <carlos lanedo com>, "tracker-list gnome org" <tracker-list gnome org>
Subject: Re: [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance
Date: Tue, 30 Mar 2010 09:42:00 +0200

2010/3/30 Chen, Zhenqiang <zhenqiang chen intel com>:

Carlos Garnacho wrote:

As Philip said, we should take into account memory usage as well, and
keeping a hashtable for each known item is not going to be nice...
TrackerCrawler guarantees that any directory will be processed after
its parent folder, and all the items in a directory will be processed
together, so we very probably can do this on a per-folder basis.


Agreed. Combining Philip's suggestion with yours, I prefer the following logic:

(1) Get the total count of items with SPARQL's COUNT.
    if count > 1000
        do a per-folder query with OFFSET and LIMIT
    else
        get all items at once.

On most systems, such as netbooks or handsets, there are not many items.
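
For illustration only, the count-based switch proposed above could be sketched
roughly as follows. This is not Tracker code: the callbacks count_all,
fetch_all and fetch_folder_page are hypothetical stand-ins for the SPARQL
COUNT query, the single "get everything" query, and a per-folder query with
OFFSET and LIMIT. The page size of 500 is also an assumption; the thread only
names the 1000-item threshold.

```python
from typing import Callable, Iterator, List

THRESHOLD = 1000  # cut-off taken from the proposal in this thread
PAGE_SIZE = 500   # assumed page size; the thread does not specify one


def fetch_items(
    count_all: Callable[[], int],
    fetch_all: Callable[[], List[str]],
    fetch_folder_page: Callable[[str, int, int], List[str]],
    folders: List[str],
) -> Iterator[List[str]]:
    """Yield batches of known items, picking the strategy by total count.

    All three callbacks are hypothetical placeholders for store queries.
    """
    if count_all() <= THRESHOLD:
        # Small store: one query for everything.
        yield fetch_all()
        return

    # Large store: page through each folder with OFFSET and LIMIT.
    for folder in folders:
        offset = 0
        while True:
            page = fetch_folder_page(folder, offset, PAGE_SIZE)
            if not page:
                break
            yield page
            if len(page) < PAGE_SIZE:
                break  # short page means the folder is exhausted
            offset += PAGE_SIZE
```

With this shape, only one page of results is held in memory at a time on
large stores, while small stores still pay for a single query.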


I would very much appreciate such a "batched" mode during initial crawling.

When I log in, tracker keeps my CPU at 100% for about 2-3 minutes
during the initial crawling.
Around 50% of the CPU is taken by tracker-store (my guess is that this
is because of the dbus messages).
The other half is taken by tracker-miner-fs.
I'd hope that by sending larger chunks in a single dbus message, the
CPU usage of tracker-store would go down as well.
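
As a rough sketch of what "larger chunks" could mean, the snippet below groups
individual updates into fixed-size batches and hands each batch to a sender in
one call. The send callback is a hypothetical placeholder for whatever actually
posts the message; no real Tracker or dbus API is used here, and the batch size
of 100 is an arbitrary assumption.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(updates: Iterable[str], size: int) -> Iterator[List[str]]:
    """Split an iterable of updates into lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(updates)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def send_batched(
    updates: Iterable[str],
    send: Callable[[List[str]], None],  # hypothetical message sender
    size: int = 100,                    # assumed batch size
) -> int:
    """Send updates in batches; return the number of messages sent."""
    messages = 0
    for batch in chunked(updates, size):
        send(batch)  # one message carries the whole batch
        messages += 1
    return messages
```

For 250 updates and a batch size of 100, this issues 3 calls to send instead
of 250.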

Cheers,
Michael

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?

References:
- [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance
  - From: Chen, Zhenqiang
- Re: [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance
  - From: Carlos Garnacho
- Re: [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance
  - From: Chen, Zhenqiang
