Re: [Nautilus-list] branch the branch?



On Tuesday, August 21, 2001, at 12:43  AM, Maciej Stachowiak wrote:

Darin, any opinions?

My personal preference is to get this all on HEAD as soon as possible. Using a branch to the branch sounds like a fine way to keep things going in the mean time.

The changes are getting big because my attempts to improve performance
are turning into a game of whack-a-mole - fix one thing and another
pops up really high on the profile.

That's exciting to hear. That makes me confident that it's real productive work, since optimization usually works like that. I'm so excited to hear about this.

The one I'm currently workign on is N^2 behavior in
nautilus-directory-async.c - I think to get good performance on large
directories it needs a queue of files that are missing at least one
piece of info instead of scanning the whole file list each time.

I knew that this might come up eventually. When you start on this change, you might want to take an extra moment consider exactly what data structures to use once you go beyond the simple-minded ones.

Here are some thoughts that you might find useful, but are free to ignore:

For example, if we didn't mind making all NautilusFile objects bigger, we could put a link to the location in the "files that need some I/O" queue inside the NautilusFile, which would make removal fast. But if we start going that direction, then we might want to use an intrusive circular list instead of using GList at all.

If we want to avoid making the Nautilus objects bigger, then a hash table might be the right choice because it's not O(n) to remove items, but on the other hand, g_list_remove might be really fast even if it is O(n). The expensive operation is probably evaluating the state of each NautilusFile object over and over again to decide what work is needed.

Another issue is the ordering of operations in nautilus-directory-async.c.
Currently it's a bit randomish, but if we want reliable behavior we might want to consider the algorithm for what work to do first. Implementing that algorithm might lead to a specific kind of data structure. One data structure that keeps an ordering, but doesn't have O(n) remove is a GList and a GHashTable with entries that point to the list nodes. That's actually the kind of data structure used for the file list, although the hash table keys are file names, not file object pointers.

Another way to go at this is to keep a separate queue of objects that need each type of work. The "changed" function would reevaluate whether the object belongs in each queu. So if 100 objects need file info, but only one needs a directory MIME type scan, then there would be two separate lists instead of one list with 101 objects.

    -- Darin

PS: I'm going to write some documentation about I/O in Nautilus today, as we discussed, to try to help future hackers who need to work in this area.




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]