Hi there,

I've done a performance analysis and comparison of Tracker's RDF store versus E-D-S (Evolution Data Server).

I'll first summarize what I think readers need to know and understand, because Tracker's RDF store and E-D-S are in fact different products that serve similar but different purposes.

o. The VCard that I'm testing with for E-D-S isn't as complex as the Nepomuk-ontology based contact that the tests are saving in the RDF example for Tracker. This means that the two tests aren't perfectly comparable. But I'm not a VCard expert myself, and the results were rather clear (for me) early on. I have attached the source code of both test programs so that you can match the complexity to produce better numbers.

o. Given that E-D-S is a single-purpose store for PIM data, we expected E-D-S to perform much better than Tracker's RDF store. Tracker is of course not a single-purpose store: it can store anything within the realm of Nepomuk's ontology (which covers many more application and use-case domains than PIM). E-D-S does not perform much better, as the report below will illustrate.

o. I personally expected E-D-S to scale to large amounts of contacts. But a simple loop of adding 2x 1000 contacts makes Evolution's UI fall flat on its face. The UI doesn't respond anymore at all. The only thing that helps is an evolution --force-shutdown. The Evolution UI also makes the entire desktop unresponsive. This matters for a platform like MeeGo, for example: a version of the Evolution UI _does_ run on MeeGo devices (making it part of E-D-S as a solution on the aforementioned platform -- the UI can and often will be running, my poor batteries).

o. For a very, very long time after the test.c 1000 contacts loop has finished, Evolution keeps running at 95% CPU, doing who knows what (draining massive amounts of power, in any case). This makes me conclude that the API is cheating and returning earlier than it should.
Maybe not everything is finished, and I wonder what would happen if the system were to crash. Would all data be guaranteed to be persistently stored on the storage hardware? Tracker's RDF store guarantees that when the API returns, the data has been stored (it has a journal for this).

o. After and while data is being entered into the store, Tracker's RDF store can perform more complex queries, and at the same time it allows queries that cross multiple use-case and application domains (deeply link contacts to IM, E-mail, Photos, Videos, Events, GEO locations, and many more classes _and_ query using the links). This is something that E-D-S, given that it's a single-purpose store, can't offer at this moment. Nor does E-D-S provide a rich query language like SPARQL for this purpose. E-D-S is more of a get and set system with a rather flat query language. We didn't test query performance this time. Given the huge differences in capabilities, it's probably not an interesting comparison to make between Tracker's RDF store and E-D-S. With Tracker's GraphUpdated you can build an auto-updating model comparable to EBookView's API. QSparql has such a model implemented for Qt.

o. Both E-D-S and Tracker's RDF store have at this moment the same or comparable security capabilities. With E-D-S you can have multiple addressbooks; with Tracker's RDF store you can have either GRAPHs or grouping with a nie:DataSource, or any other typical Nepomuk technique for this purpose (like adding an addressbook class, if necessary -- there are plenty of ways for this).

A few somewhat larger tests:
----------------------------

o. Tracker's RDF store with its newest INSERT OR REPLACE support (INSERT OR REPLACE is not yet available in master). For reference, the original that doesn't use INSERT OR REPLACE is also being run. The support for INSERT OR REPLACE is in branch sparql-update.
More info on INSERT OR REPLACE here:
http://pvanhoof.be/blog/index.php/2011/03/09/a-replace-extension-for-trackers-sparqls-update

pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 1000 contacts: 12.997943
ORIGINAL: 1000 contacts: 13.850410
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 1000 contacts: 15.442745
ORIGINAL: 1000 contacts: 27.525495
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 1000 contacts: 17.257888
ORIGINAL: 1000 contacts: 27.712315
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$

The GraphUpdated signals were cleanly being emitted; dbus-daemon was not being flooded, as GraphUpdated is engineered for these volumes of data deltas. More info on how GraphUpdated works here:
http://pvanhoof.be/blog/index.php/2010/08/24/trackers-new-class-signal-system-being-developed

o. And now E-D-S:

I had to kill Evolution's UI because after the second run my entire desktop computer was completely unresponsive and Evolution's shell was using 95% CPU. I mention this because for just 2x 1000 contacts this behaviour is, in my opinion, unacceptable. I know I will receive a lot of the typical hate for saying this, but the Evolution team shouldn't feel too proud of this. In my opinion NO amount of input should be able to make Evolution's UI hang.

pvanhoof lors:~$ gcc test.c `pkg-config libebook-1.2 --cflags --libs`
pvanhoof lors:~$ ./a.out
EDS 1000 contacts: 10.604454
pvanhoof lors:~$ ./a.out
EDS 1000 contacts: 11.362855
pvanhoof lors:~$ evolution --force-shutdown
No response from Evolution -- killing the process

That "No response from Evolution" should illustrate how bad the situation actually became: the process is not even responding to a private IPC asking it to shut down cleanly, forcing the tool to use the kernel's KILL instead. Yikes.
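To compare the 1000-contact runs more easily, the timings can be reduced to a mean per test and a contacts-per-second figure. What follows is only a minimal sketch of that arithmetic. The helper names (mean, contacts_per_second, print_summary) are made up for illustration and are not part of test.c or test-insert-or-replace.vala. It also assumes the printed numbers are elapsed seconds for the whole batch. Keep in mind that the E-D-S figure may understate the real cost, since Evolution kept working long after the test loop had returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Timings for 1000 contacts, copied from the runs in this mail.
 * Assumption: the values are elapsed seconds for the whole batch. */
const double replace_runs[]  = { 12.997943, 15.442745, 17.257888 };
const double original_runs[] = { 13.850410, 27.525495, 27.712315 };
const double eds_runs[]      = { 10.604454, 11.362855 };

/* Arithmetic mean of n timing samples. */
double mean(const double *samples, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double) n;
}

/* Contacts stored per second for a batch that took `seconds` to insert. */
double contacts_per_second(unsigned contacts, double seconds)
{
    assert(seconds > 0.0);
    return (double) contacts / seconds;
}

/* Hypothetical helper: print one summary line for a series of runs. */
void print_summary(const char *label, const double *runs, size_t n,
                   unsigned contacts)
{
    double avg = mean(runs, n);

    printf("%-9s mean %.3f s, %.1f contacts/s\n",
           label, avg, contacts_per_second(contacts, avg));
}
```

Calling print_summary("REPLACE", replace_runs, 3, 1000) from a main(), and likewise for original_runs and eds_runs, prints one line per test. With only two or three runs per test the means are a rough indication, not a statistically sound benchmark.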
Some smaller tests (with smaller differences too):
--------------------------------------------------

Luckily Evolution didn't die and crash on these smaller tests ...

pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 100 contacts: 1.348496
ORIGINAL: 100 contacts: 2.839032
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ cd /home/pvanhoof/
pvanhoof lors:~$ gcc test.c `pkg-config libebook-1.2 --cflags --libs`
pvanhoof lors:~$ ./a.out
EDS 100 contacts: 1.000067
pvanhoof lors:~$ ./a.out
EDS 100 contacts: 0.886793
pvanhoof lors:~$ ./a.out
EDS 100 contacts: 0.902376
pvanhoof lors:~$ cd ~/repos/gnome/tracker/master/tests/functional-tests/ipc
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 100 contacts: 1.375554
ORIGINAL: 100 contacts: 2.631252
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 100 contacts: 1.448647
ORIGINAL: 100 contacts: 2.700024
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$ ./test-insert-or-replace
REPLACE: 100 contacts: 1.400238
ORIGINAL: 100 contacts: 2.787517
pvanhoof lors:~/repos/gnome/tracker/master/tests/functional-tests/ipc$

I'd love to receive performance numbers from different people, under different circumstances. Please feel free to use Tracker's mailing list for publicizing your results.

Cheers,

Philip

-- 
Philip Van Hoof
freelance software developer
Codeminded BVBA - http://codeminded.be
Attachment: test.c (Description: Text Data)
Attachment: test-insert-or-replace.vala (Description: Text Data)