Re: [RFC/PATCH] Nonotify - A simplistic way to determine directory content changes



On Sun, 2004-06-06 at 03:29, nf wrote:
> On Sat, 2004-06-05 at 17:14, John McCutchan wrote:
> > On Sat, 2004-06-05 at 10:10, nf wrote:
> > > Interesting point! But is it true? I actually tried to explore this
> > > yesterday by brute-force attacking a mounted dir with stat-calls, but
> > > unmounting never was blocked. It seems that reading the inode-attributes
> > > is "almost" atomic. 
> > > 
> > > I don't know enough about the spin_lock() function. I thought it
> > > synchronizes access to a certain resource, rather than throwing an
> > > error. Can you show me the place in the kernel, where the umount
> > > blocking via "stat" would happen?
> > > 
> > 
> > Of course it's true, the kernel would never access the filesystem
> > without making sure that it wouldn't disappear mid-access. Check out
> > vfs_stat() in fs/stat.c: user_path_walk() pins dentries as it walks the
> > path you pass it, then it pins down the inode while it gets its
> > attributes into the stat structure. Like I said, it is a race that is
> > probably going to be hard to trigger, but it can happen. This will make
> > it much harder/impossible to provide deterministic behavior to the user.
> 
> You are right. Stat() calls invoke dget() and dput() on dentries - thus
> affecting usage counts. 
>
> There also seems to be a lot of synchronization involved (spin_locks),
> though, so there might be a chance that umount will pause and wait for
> the stat() call to finish. I really don't know. I always thought that
> stat() does not block umount - and that the umount problem is only
> caused by open fds.

It can happen. Check out vfs_stat(): user_path_walk() returns an owned
vfsmount in nd->mnt, which is not released (mntput) until
path_release(). During that time there is no spinlock held, so on an SMP
machine we could easily race: an umount that runs inside that window
finds the mount still referenced and fails. I guess it's harder on a UP
kernel, but it might happen in the pathname resolver, if we need to block
on reading some directory inode and the unmount process is scheduled
before we get back to the stat process.

> There would be two ways to work around this problem:
> 
> * Pragmatic but not pretty: Repeat umount if it fails.
> 
> * Better: Find a way to synchronize nonotify_stat and umount (don't let
> them happen at the same time).

You will (rightfully) never get the kernel developers to complicate the
locking and synchronization for this case, as it would slow down the
common cases. The pragmatic approach is more likely to work.
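
For illustration, the retry could live entirely in user space. This is
only a rough sketch under assumptions: umount_retry(), max_tries and
delay_ms are made-up names, not part of the patch, and it assumes the
racing stat() shows up as umount(2) failing with EBUSY. Any other error
is returned immediately, since retrying would not help.

```c
#include <errno.h>
#include <sys/mount.h>
#include <time.h>

/*
 * Hypothetical helper: call umount(2) up to max_tries times, retrying
 * only while it fails with EBUSY. Returns 0 on success, -1 with errno
 * set on failure.
 */
int umount_retry(const char *target, int max_tries, long delay_ms)
{
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
    int i;

    for (i = 0; i < max_tries; i++) {
        if (umount(target) == 0)
            return 0;
        if (errno != EBUSY)
            return -1;           /* real error: retrying won't help */
        nanosleep(&delay, NULL); /* let a racing stat() drop its reference */
    }

    errno = EBUSY;               /* still busy after max_tries attempts */
    return -1;
}
```

Something like umount_retry("/mnt/usb", 5, 100) would then give up after
about half a second. A mount that stays busy that long is most likely
held by an open fd rather than by a stat() in flight.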

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a war-weary sweet-toothed gangster haunted by memories of 'Nam. She's a 
blind African-American Valkyrie with a knack for trouble. They fight crime! 



