Re: [Tracker] Clues regarding improving performance of tracker-store



Ivan Frade wrote on 12/07/2013 18:52:

Hi guys,

I plan to write a more detailed guide to improving insert performance when I have more time. This weekend I'm very busy moving from my gf's apartment to my newly renovated house ;), so I'll keep it short.

The most important thing is to use the INSERT OR REPLACE feature instead of DELETE+INSERT. Another thing you can do is increase the LRU cache size and tweak the various buffer sizes we have in tracker-data-update.c.
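To make the difference concrete, here is a minimal sketch in Python that only builds the two kinds of SPARQL update strings. The helper names, the subject URN and the choice of nie:title are made up for illustration, and the escaping is deliberately simplistic; how the string is then sent to tracker-store is left out.

```python
def escape_literal(value: str) -> str:
    """Very small escaping helper (illustration only, not a full SPARQL escaper)."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def delete_then_insert(subject: str, predicate: str, value: str) -> str:
    """Two operations: remove the old value first, then insert the new one."""
    lit = escape_literal(value)
    return (
        f"DELETE {{ <{subject}> {predicate} ?old }} "
        f"WHERE {{ <{subject}> {predicate} ?old }}\n"
        f'INSERT {{ <{subject}> {predicate} "{lit}" }}'
    )


def insert_or_replace(subject: str, predicate: str, value: str) -> str:
    """One operation: overwrite whatever value is currently stored."""
    lit = escape_literal(value)
    return f'INSERT OR REPLACE {{ <{subject}> {predicate} "{lit}" }}'


# Hypothetical resource and property, just to show the two shapes.
print(delete_then_insert("urn:example:song1", "nie:title", "New title"))
print(insert_or_replace("urn:example:song1", "nie:title", "New title"))
```

The second form gives the store a single update to parse and execute, where the first form needs a DELETE with a WHERE pattern followed by a separate INSERT.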

Finally, changing the ontology could help. But because of the decomposed schema, an insert mostly doesn't touch the tables of ontology domains that are unrelated to the data you are inserting.

Except indeed when there are hierarchies. So if a total ontology rewrite is fine, try to reduce inheritance. Aggregation over inheritance ('has a' instead of 'is a') in the ontology will often be faster, but it also depends on a variety of things, for which you should study the insert queries that we generate for a given combination of insert and ontology. Aggregation will often make your SELECT queries more complicated (and if you need the data, probably slower too). We optimized first for read speed, then for write speed.
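As a rough illustration of why hierarchies matter for inserts, here is a toy model in Python with sqlite3. It assumes one table per class and one row per class for every resource, which is only a simplification of a decomposed schema and not the SQL that Tracker actually generates. The class names (Resource, Media, Audio, Song, FlatSong) are hypothetical.

```python
import sqlite3


def insert_resource(conn, class_chain, resource_id):
    """Write one row into the table of every class in the chain.

    Returns the number of INSERT statements that were executed.
    """
    statements = 0
    for cls in class_chain:
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{cls}" (id INTEGER PRIMARY KEY)')
        conn.execute(f'INSERT INTO "{cls}" (id) VALUES (?)', (resource_id,))
        statements += 1
    return statements


conn = sqlite3.connect(":memory:")

# 'is a' four levels deep: Song is an Audio is a Media is a Resource.
deep = insert_resource(conn, ["Resource", "Media", "Audio", "Song"], 1)

# Reduced inheritance: FlatSong derives directly from Resource.
flat = insert_resource(conn, ["Resource", "FlatSong"], 2)

print(deep, flat)
```

In this toy model the deep chain costs four INSERTs per resource and the flat one costs two. Whether the data you moved out of the hierarchy then needs resources (and INSERTs) of its own is exactly the "it depends" part, so look at the real generated queries before rewriting an ontology.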

The inserting, updating and deleting on the SQL layer itself is, by the way, not the only thing that influences insert performance. The SPARQL parsing, buffering and grouping into transactions, among other things (like IPC overhead), also play a role. Although I must say that after so many years of being plagued by Nokians who didn't like Tracker because it was Not Invented Here (not by their own team) and somewhat forced upon them, we did ensure that it's really, really optimized (and teams were challenged to find performance improvements and open bugs on them, instead of making empty arguments that it's not). It would surprise me if you'd find a single strdup or malloc that shouldn't be there, for example. But I'll be more than happy if you eliminate one.

Next, you have indexes and domain-specific indexes, which slow down inserting. And you have the change signals, which you can turn off per class; they have a memory usage and performance impact while inserting (not doing something is always faster than doing something).

If you don't need FTS, then disabling FTS should give a huge performance improvement, for the same reason: a lot of things won't be done anymore, which is always faster than doing them. But FTS is also a nice feature to have, so make your choice.


Most of my ideas for improving performance are domain specific (cut Tracker down to do exactly what you need). Philip also recently wrote up some ideas on this mailing list for reducing the database file size.

A lot of those ideas need to be implemented. Some of them aren't easy. And yet others are in Jürg's mind waiting to be unleashed on the world. And some in my mind, but then I'm sure Jürg thought about it too already ;-)

Kind regards,

Philip


