Bug #65560 (fcntl() locks not working in NFS directories)



[bugzilla is not cooperating with me, so I'm sending this to the list;
 please Cc me, as I am not subscribed.]

Bug 65560 (the need for locking in the ~/.gconfd/lock and
~/.gconf/*-backend-lock directories) was recently marked NOTABUG. I
respectfully disagree.

Consider this from the user's point of view. I just had to do exactly
that, right after telling someone else how much galeon and gconf rocked
and then watching it fail to work for him because of this bug :(

- The user probably doesn't know that gconf exists; from the user's
  POV, he's running galeon, or evolution, or the (GNOME 2) panel.
  Thus, errors from gconf must be *really* clearly reported, or
  they'll reflect poorly upon the application that's starting gconf.
  (The user in this case said that `galeon seems to be broken' and
  `it needs better error checking'... and from a user's point of
  view, this is the case.)

- gconf doesn't report its errors to the starting code correctly;
  e.g., galeon hits you with an amazing flood of

,----
| Failed to spawn the config server (gconfd): Failed to contact
| configuration server (a likely cause of this is that you have an
| existing configuration server (gconfd) running, but it isn't reachable
| from here - if you're logged in from two machines at once, you may
| need to enable TCP networking for ORBit)
`----

  messages; thousands of them in quick succession, flooding the logs
  :(

Those errors are misleading in this situation, as there is no problem
with an unreachable gconfd at all --- and you can't even get decent
debugging logs without rebuilding gconf, because to do that you need to
be able to send gconfd a SIGUSR1, and if it dies immediately you can't
do that.

It took me an hour of poring through strace dumps and the gconf source
code before I understood what was going on, and that not one but *two*
files needed to be made lockable (and maybe more than that if I were
using multiple backends) --- it shouldn't be this hard!


At the very least, the error message ought to be improved; ideally, the
lock files ought to be created in /var/lock, /var/run or /tmp: places
where transient lock files are conventionally created on Unix platforms,
and which are thus always local and always permit fcntl() locks to be
established on files within them.
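
For what it's worth, whether a given directory permits fcntl() locks
can be checked with a few lines of C. This is only a rough sketch, not
code from gconf: the function name dir_supports_fcntl_locks and the
probe file name are made up for illustration, and it only tests a
non-blocking whole-file write lock.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of gconf: returns 1 if an fcntl() write
 * lock can be taken on a scratch file created in `dir`, 0 otherwise. */
static int dir_supports_fcntl_locks(const char *dir)
{
    char path[4096];
    struct flock fl = {0};
    int fd, n, ok;

    /* The probe file name is an arbitrary choice for this sketch. */
    n = snprintf(path, sizeof path, "%s/.lock-probe-%ld", dir, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof path)
        return 0;

    fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return 0;               /* cannot even create a file here */

    fl.l_type = F_WRLCK;        /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file", i.e. the whole file */

    /* F_SETLK fails immediately instead of blocking if the lock cannot
     * be set, which is the failure seen on NFS mounts without locking. */
    ok = (fcntl(fd, F_SETLK, &fl) == 0);

    close(fd);                  /* closing the descriptor drops the lock */
    unlink(path);
    return ok;
}
```

Calling this on $HOME and on /tmp at gconfd startup would at least let
the error message say which directory is the problem.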

What's more, this would eliminate the other bug whereby a crashing
system can leave stale lockfiles around; /var/{lock,run} and /tmp get
zapped at startup for exactly this reason.


If nobody can think of a reason why the lockfiles shouldn't go in
/var/lock or thereabouts, I'll hack it up and submit a patch (trying
/var/lock/gconf-{gconfd-pid}/, then /var/run/gconf-{gconfd-pid}/, then
/tmp/gconf-{gconfd-pid}/ in order). But as things stand, users whose
home directories are NFS-served from Linux 2.0 and most 2.2 systems (and
maybe others) cannot run evolution, galeon, gconf, GNOME 2's panel or
any other gconf-using application --- while other users on the same
network, with $HOMEs mounted from a different place, may be able to run
them without difficulty. I think the suboptimality of this is obvious :)
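
The fallback order described above could look roughly like this. Again
a sketch under assumptions: pick_lock_dir is a hypothetical helper of
mine, not anything in the gconf tree, and the 0700 mode and the
"reuse only if it is our own directory" rule are my guesses at sensible
behaviour rather than settled design.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try each base directory in order and create
 * <base>/gconf-<pid> under the first one that works. On success the
 * chosen path is left in `out` and 0 is returned; otherwise -1. */
static int pick_lock_dir(const char *const *bases, size_t nbases,
                         long pid, char *out, size_t outlen)
{
    size_t i;

    for (i = 0; i < nbases; i++) {
        struct stat st;
        int n = snprintf(out, outlen, "%s/gconf-%ld", bases[i], pid);

        if (n < 0 || (size_t)n >= outlen)
            continue;           /* path would not fit; try the next base */

        if (mkdir(out, 0700) == 0)
            return 0;           /* freshly created, private to this user */

        /* An existing entry is reused only if it is a real directory
         * (not a symlink) owned by us; anything else is skipped. */
        if (errno == EEXIST &&
            lstat(out, &st) == 0 &&
            S_ISDIR(st.st_mode) &&
            st.st_uid == geteuid())
            return 0;
    }

    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

It would be called with bases set to "/var/lock", "/var/run" and "/tmp"
and pid set to gconfd's own pid; the lock files themselves would then be
created inside whichever directory comes back.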

(Yes, there is a make-symlinks workaround. It's hardly ideal. :) )

-- 
`The situation is completely under control. All of them were killed.'
     --- Alim Razim, for the Northern Alliance, demonstrating fine
         command of traditional Afghan prisoner control techniques.


