Re: [Tracker] Clues regarding improving performance of tracker-store
- From: Philip Van Hoof <philip codeminded be>
- To: Ivan Frade <ivan frade gmail com>, Jonatan Pålsson <jonatan palsson pelagicore com>
- Cc: "tracker-list gnome org" <tracker-list gnome org>
- Subject: Re: [Tracker] Clues regarding improving performance of tracker-store
- Date: Sat, 13 Jul 2013 09:29:55 +0200
Ivan Frade wrote on 12/07/2013 18:52:
Hi guys,
I plan to write a more detailed guide to improving insert performance
when I have more time. This weekend I'm very busy with moving from my
gf's apartment to my newly renovated house ;), so I'll keep it short.
The important thing is to use the INSERT OR REPLACE feature instead of
DELETE+INSERT. Another thing you can do is increase the LRU cache size
and tweak the various buffer sizes we have in tracker-data-update.c.
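
To make the first point concrete, here is a minimal sketch of the two update
styles as SPARQL strings built in Python. The subject URN, the use of
nie:title as a single-valued property, and both helper names are assumptions
chosen for illustration; check the queries against your own ontology before
relying on them.

```python
def escape_literal(value):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def delete_then_insert(subject, predicate, value):
    """Two-step update: delete the old value, then insert the new one."""
    literal = escape_literal(value)
    return (
        f'DELETE {{ <{subject}> {predicate} ?old }} '
        f'WHERE {{ <{subject}> {predicate} ?old }}\n'
        f'INSERT {{ <{subject}> {predicate} "{literal}" }}'
    )


def insert_or_replace(subject, predicate, value):
    """Single-step update using Tracker's INSERT OR REPLACE extension."""
    literal = escape_literal(value)
    return f'INSERT OR REPLACE {{ <{subject}> {predicate} "{literal}" }}'


# Hypothetical resource and property, for illustration only.
slow = delete_then_insert("urn:example:song1", "nie:title", "New title")
fast = insert_or_replace("urn:example:song1", "nie:title", "New title")
print(slow)
print(fast)
```

The second form is one update statement instead of two, so there is less to
parse and execute for every changed value.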
Finally, changing the ontology could help. But because of the decomposed
schema, an insert mostly wouldn't touch the tables of ontology domains
that aren't related to the data you are inserting.
Except indeed when there are hierarchies. So if a total ontology rewrite
is fine, try to reduce inheritance. Aggregation over inheritance ('has
a' instead of 'is a') in the ontology will often be faster, but it also
depends on a variety of things, for which you should study the insert
queries that we generate for a given combination of insert and ontology.
Aggregation will often make your SELECT queries more complicated (and if
you need the data, probably slower too). We optimized first for read
speed, then write speed.
The inserting, updating and deleting on the SQL layer itself is by the
way not the only thing that influences insert performance. The SPARQL
parsing, buffering and grouping into transactions among other things
(like IPC overhead) also play a role. Although I must say that after so
many years of being plagued by Nokians who didn't like Tracker because
it was Not Invented Here (not by their own team) and somewhat enforced
upon them, we did ensure that it's really really optimized (and teams
were challenged to find performance improvements and open bugs on them,
instead of making empty arguments that it's not). It would surprise me
if you'd find a single strdup or malloc that shouldn't be there, for
example. But I'll be more than happy if you eliminate one.
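
As a generic illustration of why grouping into transactions matters at the
SQL layer, here is a small sketch using Python's sqlite3 module. It is not
Tracker's code: the table, the column names and the batch size are made up,
and the only point is that many rows share one commit.

```python
import sqlite3


def insert_batched(conn, rows, batch_size=100):
    """Insert rows, committing once per batch. Returns the number of commits."""
    commits = 0
    cur = conn.cursor()
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # All rows of one batch go into a single implicit transaction.
        cur.executemany(
            "INSERT INTO resource (uri, title) VALUES (?, ?)", batch
        )
        conn.commit()
        commits += 1
    return commits


# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource (uri TEXT PRIMARY KEY, title TEXT)")

rows = [(f"urn:example:{i}", f"Title {i}") for i in range(250)]
commits = insert_batched(conn, rows, batch_size=100)
count = conn.execute("SELECT COUNT(*) FROM resource").fetchone()[0]
print(commits, count)
```

On an on-disk database every commit costs a sync, so fewer and larger
transactions usually insert faster, at the price of more buffered data.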
Next you have indexes and domain specific indexes that'll slow down
inserting. And you have the signals on changes, which have a memory
usage and performance impact while inserting. You can turn them off on
a class (not doing something is always faster than doing something).
If you don't need FTS, then disabling FTS should make a huge performance
improvement, for the same reason: a lot of things won't be done anymore,
which is always faster than doing them. But FTS is also a nice feature
to have. So make your choice.
Most of my ideas for improving performance are domain specific (cut
Tracker down to do exactly what you need). Philip also recently posted
some ideas for reducing the database file size on this mailing list.
A lot of those ideas need to be implemented. Some of them aren't easy.
And yet others are in Jürg's mind waiting to be unleashed on the world.
And some in my mind, but then I'm sure Jürg thought about it too already ;-)
Kind regards,
Philip