Re: stale IORs in lock directory ...



Michael Meeks <michael ximian com> writes:
> Hi Havoc,
> 
> On 13 Aug 2001, Havoc Pennington wrote:
> > Are you considering that I ping the existing lock holder by invoking
> > the _ping() method on it?
> 
>         There is really not much need for a ping, a simple
> CORBA_Object_non_existent will tell you whether the other end is accepting
> communications - or points at a now defunct server.

CORBA_Object_non_existent() in ORBit1 is implemented as an inefficient
and busted-ass hack IIRC. ;-) It looks different in ORBit2, but I
haven't investigated enough to know exactly what it does, or under
what conditions it works vs. under what conditions a oneway method
(e.g. ConfigServer_ping()) will fail. Can you explain when it will
make a difference? That is, when does the set of cases where ping()
returns a comm failure differ from the set of cases where
non_existent() returns true?

> > I'm not sure I've figured out which code path you're describing -
> > you're saying that if a server exists and we pass the old one to oaf,
> > it doesn't end up owning the lock? Didn't it already have the lock?
> 
>         Um - well, possibly - but it's quite possible that the server
> mentioned in ~/.gconfd is actually dead - and it's good to check for that
> before giving its IOR to oaf and exiting.

I don't understand - under what circumstances will the server in
~/.gconfd be dead, but the ping will not result in an error?

There's certainly a race (server dies between ping and being handed to
OAF), but that race exists for all servers being handed to OAF in all
situations, I would expect. 
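
As a toy illustration of that window (this is not ORBit, OAF or gconfd
code; FakeServer, CommFailure, is_alive and hand_off are all names made
up for the sketch), a check-then-hand-off sequence might look roughly
like this:

```python
class CommFailure(Exception):
    """Hypothetical stand-in for a communication failure from the ORB."""


class FakeServer:
    """Toy model of the server whose IOR sits in the lock directory."""

    def __init__(self):
        self.alive = True

    def ping(self):
        # A dead server cannot answer, so the call fails.
        if not self.alive:
            raise CommFailure("server is gone")


def is_alive(server):
    """Return True if the server answers a ping, False on a comm failure."""
    try:
        server.ping()
        return True
    except CommFailure:
        return False


def hand_off(server, registry, before_register=None):
    """Check the server, then register it (the 'hand to OAF' step).

    before_register is an optional callback that models whatever happens
    in the gap between the liveness check and the registration.
    """
    if not is_alive(server):
        return False          # stale entry: caught by the check
    if before_register is not None:
        before_register()     # the race window
    registry.append(server)
    return True


# Demo: the server dies after the check but before registration.
registry = []
server = FakeServer()


def kill_server():
    server.alive = False


handed_off = hand_off(server, registry, before_register=kill_server)
```

In this model the check passes and the now-dead server is registered
anyway, which is the race described above; a server that was already
dead before the check is rejected as expected.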

> > This whole mess is just a workaround for OAF "losing" gconfd anyhow,
> > so I wish it could go away.
> 
>         It is well worth fixing the bug in the right place IMHO, then we
> might get a fix that is generically useful to other programs, and not
> leave an ugly - not understood race / fault in the underlying system -
> which sucks.

Well sure, I would love to see the bug fixed in the right place. But I
don't have time (or understanding of OAF) to do it; it's a pretty
major project to get it reliable. Since I'm the one that gets all the
bugs about duplicate servers, I put in an official Bad Hack to fix it
for now.

Havoc



