Re: [Tracker] Clues regarding improving performance of tracker-store

From: Philip Van Hoof <philip codeminded be>
To: Ivan Frade <ivan frade gmail com>, Jonatan Pålsson <jonatan palsson pelagicore com>
Cc: "tracker-list gnome org" <tracker-list gnome org>
Subject: Re: [Tracker] Clues regarding improving performance of tracker-store
Date: Sat, 13 Jul 2013 09:29:55 +0200

Ivan Frade wrote on 12/07/2013 18:52:

Hi guys,

I plan to write a more detailed guide to improving insert performance when I have more time. This weekend I'm very busy moving from my gf's apartment to my newly renovated house ;), so I'll keep it short.

The important thing is to use the INSERT OR REPLACE feature instead of DELETE+INSERT. Another thing you can do is increase the LRU cache size and tweak the various buffer sizes we have in tracker-data-update.c.
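
A minimal sketch of the difference between the two update styles, seen from a client that builds its SPARQL update strings in Python. The helper names, the `urn:example:song1` subject and the choice of `nie:title` as a single-valued property are assumptions for illustration only; the code only builds strings and does not talk to tracker-store.

```python
def escape_literal(value):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def delete_then_insert(subject, predicate, value):
    """Two operations: remove the old value, then insert the new one."""
    literal = escape_literal(value)
    return (
        "DELETE { <%s> %s ?old } WHERE { <%s> %s ?old }\n"
        'INSERT { <%s> %s "%s" }'
        % (subject, predicate, subject, predicate, subject, predicate, literal)
    )


def insert_or_replace(subject, predicate, value):
    """One operation: Tracker's INSERT OR REPLACE extension."""
    literal = escape_literal(value)
    return 'INSERT OR REPLACE { <%s> %s "%s" }' % (subject, predicate, literal)


# Hypothetical resource and value, for illustration only.
slow = delete_then_insert("urn:example:song1", "nie:title", "New title")
fast = insert_or_replace("urn:example:song1", "nie:title", "New title")
```

The second form hands tracker-store a single operation to parse and execute, rather than a delete with a WHERE pattern followed by a separate insert.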

Finally, changing the ontology could help. But because of the decomposed schema, you wouldn't often touch tables of ontology domains that aren't related to the data you insert.

Except, indeed, when there are hierarchies. So if a total ontology rewrite is fine, try to reduce inheritance. Aggregation over inheritance ('has a' instead of 'is a') in the ontology will often be faster, but it also depends on a variety of things, for which you should study the insert queries that we generate for a given insert vs. ontology situation. Aggregation will often make your SELECT queries more complicated (and, if you need the data, probably slower too). We optimized first for read speed, then write speed.
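
To make the hierarchy point concrete, here is a toy model, assuming a decomposed schema with one table per class in which inserting a resource writes a row for its class and for every ancestor class. The `ex:` class names and both ontologies are hypothetical and are not Tracker's actual schema or generated queries.

```python
def tables_touched(cls, parents):
    """Return the class plus all its ancestors: one table each in this toy model."""
    touched = []
    pending = [cls]
    while pending:
        current = pending.pop()
        if current not in touched:
            touched.append(current)
            pending.extend(parents.get(current, []))
    return touched


# 'is a': ex:Song is an ex:Audio, which is an ex:Media, which is an ex:Resource.
inheritance = {
    "ex:Song": ["ex:Audio"],
    "ex:Audio": ["ex:Media"],
    "ex:Media": ["ex:Resource"],
}

# 'has a': ex:Song only derives from ex:Resource and would point to a
# separate ex:AudioInfo resource through a property.
aggregation = {
    "ex:Song": ["ex:Resource"],
    "ex:AudioInfo": ["ex:Resource"],
}

deep = tables_touched("ex:Song", inheritance)
flat = tables_touched("ex:Song", aggregation)
```

In this model the inherited song touches four tables and the aggregated one touches two. If the aggregated ex:AudioInfo resource also has to be inserted, it brings its own tables, which is one reason the outcome depends on the situation.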

The inserting, updating and deleting on the SQL layer itself is, by the way, not the only thing that influences insert performance. The SPARQL parsing, buffering and grouping into transactions, among other things (like IPC overhead), also play a role. Although I must say that after so many years of being plagued by Nokians who didn't like Tracker because it was Not Invented Here (not by their own team) and somewhat enforced upon them, we did ensure that it's really, really optimized (and teams were challenged to find performance improvements and open bugs on them, instead of making empty arguments that it's not). It would surprise me if you'd find a single strdup or malloc that shouldn't be there, for example. But I'll be more than happy if you eliminate one.
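
As a rough illustration of what grouping into transactions means at the SQL layer, the sketch below uses Python's sqlite3 module on an in-memory database. The `resource` table and the `insert_rows` helper are assumptions made up for this example; this is not how tracker-data-update.c does its buffering, only the general idea of one commit per batch versus one commit per row.

```python
import sqlite3


def insert_rows(conn, rows, batch_size):
    """Insert rows with one transaction per batch; return the number of commits."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        with conn:  # the connection context manager commits on success
            conn.executemany(
                "INSERT INTO resource (uri, title) VALUES (?, ?)", batch
            )
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource (uri TEXT PRIMARY KEY, title TEXT)")

# Hypothetical data, for illustration only.
rows = [("urn:example:%d" % i, "Title %d" % i) for i in range(1000)]

per_row = insert_rows(conn, rows[:500], batch_size=1)    # one commit per row
grouped = insert_rows(conn, rows[500:], batch_size=500)  # one commit in total

total = conn.execute("SELECT COUNT(*) FROM resource").fetchone()[0]
```

Both calls store the same number of rows; the grouped call pays the commit cost once instead of 500 times, which on an on-disk database is where most of the difference would come from.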

Next you have indexes and domain-specific indexes that'll slow down inserting. And you have the signals on changes that you can turn off on a class, which will have a memory usage and performance impact while inserting (not doing something is always faster than doing something).

If you don't need FTS, then disabling FTS should make a huge performance improvement, for the same reason: a lot of things won't be done anymore, which is always faster than doing them. But FTS is also a nice feature to have, so make your choice.

Most of my ideas for improving performance are domain specific (cut Tracker to do exactly what you need). Philip also wrote some ideas to reduce the database file size recently on this mailing list.

A lot of those ideas need to be implemented. Some of them aren't easy. And yet others are in Jürg's mind, waiting to be unleashed on the world. And some are in my mind, but then I'm sure Jürg has thought about them already too ;-)


Kind regards,

Philip

Follow-Ups:
- Re: [Tracker] Clues regarding improving performance of tracker-store
  - From: Jonatan Pålsson
- Re: [Tracker] Clues regarding improving performance of tracker-store
  - From: Martyn Russell

References:
- [Tracker] Clues regarding improving performance of tracker-store
  - From: Jonatan Pålsson
- Re: [Tracker] Clues regarding improving performance of tracker-store
  - From: Ivan Frade
