Hi, this is a mail that I once sent to a few people at Nokia who wanted direct access to SQL too:
Hi guys, I made a very simple test case that selects a LIMIT of 1 until 100 of a few columns out of the Services table. After I finished the "over DBus" version of it, and while I was measuring its performance, I was already confident that as konttori pointed out too, that the DBus overhead truly is minimal when compared to query time. I could write the same test with direct access and it would most likely shave of another few tenths of milliseconds. But for a UI application I don't really see the point in that (a mainloop iteration that has to do a few exposes and draws is likely going to take longer). So I attached a vala app for testing this and I included the generated C source code for it. `pkg-config dbus-glib-1 --cflags --libs` it and for the Vala stuff take a look here: http://live.gnome.org/Vala/DBusSample So this is over DBus: pvanhoof tinc:~/test$ ./test-sql-tracker ... 0.035126 seconds elapsed pvanhoof tinc:~/test$ Now with relation to the queuing: SQLite has a write-lock per transaction and it keeps all tables in the connection involved in the transaction, locked. A write lock means that while we (Tracker) are writing, you (your process, your connection, your *direct* connection to the SQLite tables indeed) is locked out. Tracker writes in long transactions because SQLite is 50 or 60 times faster at writing if you group lots of writes together. If you don't do this you also have the fsync() problem more often. This is a similar problem as what Firefox started having when it switched to SQLite for several things. Short: This means that we can't turn off our use of transactions. It's vital in Tracker's design and performance. For preempting these transactions we require a very strict communication between the indexer and the front end query mechanism. This implies a synchronous DBus message to the indexer. This DBus message will instruct the indexer to do a preemptive commit of its standing transaction. This is possible because the transaction is not done for atomicity but is only done for improving the write speed of SQLite (so we can early commit it, and we do that). So even if we'd make a library that has a direct (in process with the app linking with that library) connection to the SQLite database, then we'd still, just like what trackerd (the front end query mechanism) has to do too, would need to execute a DBus message to request a preemptive commit to the indexer. Making it pointless to do a direct connection in the first place (because you still have the DBus overhead for this one message, anyway). Some pointers: http://www.sqlite.org/faq.html#q5 http://www.sqlite.org/lang_transaction.html What would be possible to avoid queuing, in case a lot of applications concurrently want to query Tracker, is to introduce a connection pool and a queue for each connection in the pool, to Tracker's query front end. I must warn that although very much possible, that this solution adds complexity. If it's not proved that concurrent access will very often occur, I don't think that it's worthwhile right now to implement this solution already. But we can keep it in mind for when the day comes. /EO my advise ;)
On Tue, 2008-10-21 at 08:47 -0400, Jamie McCracken wrote:
On Tue, 2008-10-21 at 09:36 +0100, Martyn Russell wrote:Also in the future i want to support direct access to sqlite via a client lib so we can bypass dbus (and trackerd) for select queries where speed is paramount and volume of data is too big for dbus to handle optimally (think get all my 100,000 music tracks with metadata). So this library would have to handle all querying and any future ones (like sparql) - so you will have no problem from me for implementing that support in a libHmm, I would like to see the difference it makes using DBus and if it really is an issue. We have an API like this in DBus now which Phillip added - I really don't like the idea of people executing random SQL on the databases. It can lead to much bigger problems. Phillip stresses this in the .xml file where we document this API. I think quite rightly so too.that should be moved to a direct access lib the advantage of a direcdt access lib is that it removes dbus overhead when large amounts of data are required. Rob taylor probably knows this better but from what i understand dbus is not optimal for large payloads (~1mb+) and somehting like get all music and metadata might involve a million plus strings which dbus would have to marshall, strdup and validate individually into multiple packets (IIRC packet size is 4kb?) so you are looking at massive overhead with multiple ipc calls _______________________________________________ tracker-list mailing list tracker-list gnome org http://mail.gnome.org/mailman/listinfo/tracker-list
-- Philip Van Hoof, freelance software developer home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org http://pvanhoof.be/blog http://codeminded.be
Attachment:
test-sql-tracker.vala
Description: Text Data