So I think at this point we can call RhythmDB a success, but it wasn't as big a one as I had hoped. There are still issues with locking, although fewer than with RBNode. Also it will need reworking to solve issues like filesystem coherency. So, let's investigate. * Locking One major problem is doing a rhythmdb_write_lock from the main (GTK+) thread. What happens now, when you click on an artist or album (in the library), the main GTK+ thread grabs the a read lock on the db, prepares some data structures for the query (specifically the genre/artist/album bits), and then read unlocks and kicks off a query. The query runs in a separate thread. It first grabs a read lock on the db itself, and starts pushing data into the query model into an async queue. The main thread occasionally polls the query model for updates. Finally, when the query is complete, a special entry is put into the query model update queue, and when the main thread polls, it gets the completion signal and resets the genre/artist/album stuff back to normal. The query thread, meanwhile, is waiting for acknowledgment that the main thread got the completion, and when it gets it, it read unlocks the db and returns. Several problems here. One, is there's a race condition between the read unlock in rb_library_source_do_query and when the db is read locked in query_thread_main. So a write could sneak in there, which would be bad because it would probably corrupt the refcounts on the properties. Secondly, what happens if you try to do something that requires a write lock on the db from the main thread while a query is in progress? An example is creating a new iradio station. That will result in deadlock, since the write lock won't succeed until the db is read unlocked by the query thread, but the query thread won't read unlock until the main thread acknowledges the query completion (or cancels it). A smaller problem, but still annoying is that since the query runs in another thread, at least two context switches are required before you start to see results. Really, the whole thing is just too complicated (you may have guessed I was leading up to this). Let's take a step back here. Why run the query in a separate thread? The only advantage here really is that you'll get a speedup on SMP systems. But how much of one? I was looking at some profiles, and the time spent in the query thread was almost invisible. All the real work is pretty much done in the main thread anyways, like inserting into data structures and redrawing the treeview. The conclusion I've come to though is that if Rhythmbox ever locks up or crashes because of something like this, we have lost all the advantages we gained. Stability is a lot more important than a slight speed boost. I want Rhythmbox to be totally rock solid, something you feel you can always rely on. Some of the crash reports in Bugzilla of course are really GStreamer, but we have a lot of our own. So I want to vastly shrink RhythmDB's use of threads. I now envison just 2 threads (apart from GStreamer). The first is the main GTK+ thread. The second is a thread that just does filesystem work - it will read metadata from files, watch directories for changes, recursively read directories, and the like. Both threads have GAsyncQueues where they receive updates from the other thread. The advantage of this scheme is that we don't need any kind of locking inside the main GTK+ thread. Basically the GDK lock and the db lock are merged. The main thread knows that the database will never change under its feet when it holds the GDK lock. It can also do writes (like create a new iradio station) without any locking. From a periodic idle handler, the main thread will poll the fs thread for changes. It will then perform any actions necessary in the models. How will incremental queries work? I envision this just being an idle handler (not a thread) that walks the db and pushes entries into a model. This makes cancelling a query a whole lot easier than the current dance we go through - with an idle handler, we could just remove it from the glib main loop, and we're done. Making this change shouldn't be incredibly difficult; the hardest part will be turning the threaded query into an idle handler. * Filesystem coherency In summary: http://bugzilla.gnome.org/show_bug.cgi?id=125177 This is also related to: http://bugzilla.gnome.org/show_bug.cgi?id=125452 Here are my thoughts on this. Currently, RhythmDB just stores a set of songs (set in the mathematical sense, so no ordering at all). Each song has a URI associated with it; RhythmDB maintains a bijective mapping between URIs and songs. One issue with this is that RhythmDB has no notion of directories. But it should, because this allows us to make some important optimizations. For example, as I mentioned in the bug, we can just stat() each toplevel directory on startup, and if those haven't changed, we're done. More generally though, we want to monitor directories for updates at runtime as well. So what we really need is for RhythmDB to maintain a specialized kind of indexed filesystem, and to keep it in sync with the real filesystem. The RhythmDB filesystem will never be written to directly. When the user clicks delete on a song, we will simply delete it from the real filesystem; then, the fs thread will eventually notify us of this deletion, because it'll be watching the directory. RhythmDB should primarily store (filesystem, inode) pairs, for both directories and files. That's sufficient for unique identification. These are the "device" and "inode" members of a GnomeVFSFileInfo. Note that storing things this way solves the recursive symlink problem; if during a recursive directory traversal we hit a (filesystem, inode) pair that's already in the DB, we just ignore it. However, this doesn't work for non-local filesystems. So we'll still have to store URIs for those. So I am thinking the structure will look something like this: struct RhythmDBFile { gboolean is_local; gboolean is_dir; dev_t device; /* Both of these are invalid if */ GnomeVFSInodeNumber inode; /* is_local is false */ time_t mtime; char *name; /* UTF-8 */ union { GHashTable *dirents; /* used if this is a dir */ RhythmDBEntry *entry; /* used if this is a song */ } data; }; Now, RhythmDB stores a set of RhythmDBFiles that are the "roots". It also stores a hash table which maps (device, inode) pairs to RhythmDBFiles. This is because updates could come in two forms; our fs monitoring thread will tell us things like (3, 3532523) has been deleted. Then we need to look that up in the hash table, map it to a RhythmDBEntry, and signal the main thread that that entry has been deleted. But we also need to be able to handle plain URIs; as I mentioned for remote filesystems we don't have inodes. So the strategy there will be to try matching the URI against each root in turn. If there's a match, then we recursively traverse the directory pointers, matching up URI components, until we get to a RhythmDBEntry. This should be reasonably efficient. So, thoughts?
This is a digitally signed message part