Re: [Tracker] miner-fs: Placing monitors on directories takes way too much time




Hum, yes, that's the idea actually; I don't understand why you say it's
a regression. CREATED and UPDATED events will get merged in the SPARQL
buffer, and the buffer will be flushed (committed to the store) if any of

Regression means it takes more time than other branches. 
But I expect it should take less time than other branches.


Well, there may be a reason for that.

When trying to determine the cause of the speedup, I ran several tests
measuring the time tracker-store spends actually doing the SQL update
in the DB; the results showed that merging the updates into a single
SQL update in the SQLite DB also gave some improvement.

See this example, for 5075 images, using the "miner-fs-merge-updates"
branch (the 'with patch' results):
* git master, Mining: 343.13s (sqlite updates take ~90s)
* "miner-fs-merge-updates" branch, Mining: 248.48s (sqlite updates take
~70s)

This means that there was a speedup from 343s to 248s, which is around
95s; of that improvement, about 20s came from SQLite itself while
inserting the data in the database. So, a single insert of 100 items in
the DB is faster than 100 inserts of 1 element. Thus, the speedup came
from several factors: not only dbus overhead but also SQLite insert
time.
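
To illustrate the "one insert of 100 items vs. 100 inserts of 1 item"
point, here is a minimal sketch using Python's sqlite3 module. It is not
Tracker code: the "items" table, its columns and the function names are
made up for the example, and an in-memory DB will show a much smaller gap
than an on-disk one, where every commit has to hit the disk.

```python
import sqlite3
import time


def make_db():
    # Hypothetical schema, only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (uri TEXT PRIMARY KEY, title TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    """Commit after every row: one transaction per insert."""
    for row in rows:
        conn.execute("INSERT INTO items (uri, title) VALUES (?, ?)", row)
        conn.commit()


def insert_merged(conn, rows):
    """Insert the whole batch inside a single transaction."""
    with conn:  # commits once at the end, rolls back on error
        conn.executemany("INSERT INTO items (uri, title) VALUES (?, ?)", rows)


def timed(insert_fn, rows):
    """Return (elapsed seconds, number of rows stored) for one strategy."""
    conn = make_db()
    start = time.perf_counter()
    insert_fn(conn, rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return elapsed, count


rows = [("file:///img/%d.jpg" % i, "image %d" % i) for i in range(100)]

if __name__ == "__main__":
    for fn in (insert_one_by_one, insert_merged):
        elapsed, count = timed(fn, rows)
        print("%s: %d rows in %.6fs" % (fn.__name__, count, elapsed))
```

Both strategies end up storing the same rows; only the number of
transactions differs.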

Note anyway that the effect of SQLite insert time is much smaller than
the actual overhead due to dbus (the example above was actually the
worst one in terms of % of time in SQLite inserts).
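
For reference, the breakdown of the numbers above can be redone with a
few lines of Python. Only the four measurements come from the test runs
quoted earlier; the variable and function names are just for this
sketch, and the "~90s" / "~70s" figures are approximate, so the split is
approximate too.

```python
# Measurements quoted above (5075 images), in seconds.
MASTER_TOTAL = 343.13      # git master, total mining time
MASTER_SQLITE = 90.0       # approximate time in sqlite updates
MERGED_TOTAL = 248.48      # "miner-fs-merge-updates", total mining time
MERGED_SQLITE = 70.0       # approximate time in sqlite updates


def breakdown():
    """Split the total speedup into SQLite and non-SQLite parts."""
    total_speedup = MASTER_TOTAL - MERGED_TOTAL
    sqlite_speedup = MASTER_SQLITE - MERGED_SQLITE
    return {
        "total_speedup": total_speedup,
        "sqlite_speedup": sqlite_speedup,
        "other_speedup": total_speedup - sqlite_speedup,
        "sqlite_share": sqlite_speedup / total_speedup,
    }


if __name__ == "__main__":
    for name, value in breakdown().items():
        print("%s: %.2f" % (name, value))
```

The non-SQLite part (dbus overhead and anything else) is by far the
larger share of the speedup.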

With the new "miner-fs-refactor-multi-insert" branch, we are merging
SPARQL updates in a single dbus connection; but still those updates are
then SQL-inserted one by one in the SQLite database (IIRC, pvanhoof?),
not all of them together as in the original "miner-fs-merge-updates"
branch. So, following the example above, we should actually be seeing an
improvement of about 95-20=75s with this new branch (I will run some
tests this week with the new branch; I don't have real numbers yet).
Yes, it's slower than the original branch, but now we are actually
getting error reporting per update, instead of a single error for all
merged updates.
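
The trade-off can be sketched as follows: the updates arrive together as
one batch, but each one is applied on its own, so a failure is reported
for that update only. Again, this is a Python/sqlite3 stand-in and not
the tracker-store implementation; apply_batch(), the "items" table and
the sample statements are hypothetical.

```python
import sqlite3


def apply_batch(conn, updates):
    """Apply each (sql, params) update separately.

    Returns a list with one entry per update: None on success,
    or the error message if that particular update failed.
    """
    results = []
    for sql, params in updates:
        try:
            with conn:  # commit this update, or roll it back on error
                conn.execute(sql, params)
            results.append(None)
        except sqlite3.Error as exc:
            results.append(str(exc))
    return results


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (uri TEXT PRIMARY KEY, title TEXT)")
    insert = "INSERT INTO items (uri, title) VALUES (?, ?)"
    batch = [
        (insert, ("file:///a.jpg", "a")),
        (insert, ("file:///a.jpg", "duplicate uri, will fail")),
        (insert, ("file:///b.jpg", "b")),
    ]
    results = apply_batch(conn, batch)
    stored = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return results, stored


if __name__ == "__main__":
    results, stored = demo()
    for index, error in enumerate(results):
        print("update %d: %s" % (index, error or "ok"))
    print("rows stored:", stored)
```

With fully merged updates, the failing second update would instead make
the whole batch fail with a single error.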

As soon as I have some numbers of my own with the new branch, I will
post them here.

-- 
Aleksander



