dead nfs mount blocks opening other folders



Hi All,

I noticed some strange Nautilus behaviour, and after spending a couple of days in the depths of libnautilus-private/nautilus-directory-async.c I think I have found a bug. I talked about it briefly with Alex on IRC, and he asked me to write it up in an email so that he could give a second opinion later. So here it is.

The test OS is Solaris.

I have a dead NFS mount on my Solaris machine; it points to a machine that was disconnected long ago. I noticed that if I open the folder that contains this mount, I cannot open any other folder for a while (double-clicking does nothing apart from bringing up the "Do you want to cancel ..." dialog).

So I have /brokennfsmount as the dead NFS mount point. When I click on /, it opens and lists its contents. Then I click on, say, /aaa, and it doesn't open, and there is no feedback about what is going on. Also, if I zoom in, /brokennfsmount has "? items" displayed (all other elements show valid item counts).

I looked at the source and I think the problem lies in start_or_stop_io() in libnautilus-private/nautilus-directory-async.c.

The call stack is roughly as follows: when I double-click on an icon, nautilus_directory_call_when_ready_internal() is called; this calls nautilus_directory_async_state_changed(), which in turn calls start_or_stop_io().

The real processing (figuring out info like depth count, number of items in the folder, etc.) is done in start_or_stop_io(). There are three queues here: the high priority, low priority and extension queues. nautilus_directory_call_when_ready_internal() adds the selected file, or all the contents of the selected folder, to the high priority queue. In start_or_stop_io() the elements are first processed in the high priority queue and then moved to the end of the low priority queue; they are processed there, then moved to the end of the extension queue and processed there.
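
To make the flow described above easier to picture, here is a toy model of the three queues in Python. It is only a sketch of the description in this mail, not the Nautilus code: the class ToyDirectory, the is_ready callback and the done list are names invented for illustration, and the details of the real start_or_stop_io() are not reproduced.

```python
from collections import deque


class ToyDirectory:
    """Toy model of the three work queues (hypothetical, not Nautilus code)."""

    def __init__(self):
        self.high = deque()       # high priority queue
        self.low = deque()        # low priority queue
        self.extension = deque()  # extension queue
        self.done = []            # items that have left all three queues

    def call_when_ready(self, names):
        # New work always enters at the end of the high priority queue.
        self.high.extend(names)

    def start_or_stop_io(self, is_ready):
        # is_ready(name, stage) is an assumed callback that says whether the
        # async work for `name` in the given stage has completed.
        stages = [
            ("high", self.high, self.low),
            ("low", self.low, self.extension),
            ("extension", self.extension, self.done),
        ]
        for stage, queue, next_queue in stages:
            # Each queue is consumed from the front; finished items are
            # appended to the end of the next queue.
            while queue and is_ready(queue[0], stage):
                next_queue.append(queue.popleft())
```

With an is_ready callback that always returns True, every item added through call_when_ready() ends up in done, in its original order, after a single call to start_or_stop_io().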

All this info gathering is done asynchronously with gnome-vfs async callbacks. These callbacks satisfy the conditions (the various pieces of info having been gathered) that are needed to move items down through the different queues.

And here lies the problem. The queues are processed from the front and new items are added to the end. An item at the front of a queue can block the processing of all items behind it. If my NFS folder does not come back from the async call for a long time, then none of the items behind it will be processed. So when I click on "/", all elements from "/" go into the high priority queue, get processed, are then moved to the low priority queue, and so on. /brokennfsmount does not move out of the low priority queue for a very long time. The reason is that directory_count_start() takes a very long time to complete: this function registers a callback through gnome_vfs_async_load_directory() to do the counting, and it takes a very long time for this callback to be called.
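
The blocking effect can be reproduced with a few lines of Python. This is a minimal sketch under my own assumptions, not the Nautilus implementation: process_from_front() and has_returned are invented names, and /bbb and /ccc are made-up sibling folders added only to show what sits behind the stuck entry.

```python
from collections import deque


def process_from_front(queue, has_returned):
    """Pop items only while the head's async call has returned."""
    finished = []
    while queue and has_returned(queue[0]):
        finished.append(queue.popleft())
    return finished


# Hypothetical contents of the low priority queue after opening "/".
low_priority = deque(["/aaa", "/brokennfsmount", "/bbb", "/ccc"])
stuck = {"/brokennfsmount"}  # the callback for this entry has not fired yet

finished = process_from_front(low_priority, lambda path: path not in stuck)
print(finished)            # ['/aaa']
print(list(low_priority))  # ['/brokennfsmount', '/bbb', '/ccc']
```

Even though /bbb and /ccc are perfectly healthy in this model, they stay queued because the loop never looks past the head of the queue.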

If I try to cd into /brokennfsmount or ls its contents from a shell, it also takes a long time to get the command prompt back, so the slowness is not specific to gnome-vfs.

I think that we should make sure in Nautilus that situations like this cannot block the processing of other folders, rather than changing things in gnome-vfs.

I don't have any obvious solution in mind yet (I don't think I know the code well enough for that). This is mostly for Alex and other interested people to understand the problem, and maybe to provide some feedback (if they know this code) on whether they agree that what I described is a problem that needs to be fixed.

I would appreciate any feedback provided.

Thanks,

Laszlo


