Re: Proposal: Ref Counting Conventions

From: "Michael K. Fleming" <mikef praxis etla net>
To: gnome-components-list gnome org
Cc: Maciej Stachowiak <mjs eazel com>,Miguel de Icaza <miguel helixcode com>,Michael Fleming <mfleming eazel com>
Subject: Re: Proposal: Ref Counting Conventions
Date: Wed, 17 May 2000 10:33:02 -0700 (PDT)
This is a long post with fragments of the argument clipped together.  Here
is a summary of my position:

1. It seems that people agree about conventions (1) and (3).  Should we
adopt them?

2. Convention (2) deals with inout parameters.  I maintain that convention
(2) correctly distinguishes inouts from "ins" and "outs", and that it
lets callers behave consistently without having to do anything
weird when dealing with such parameters.  My argument continues below.

3. I'd also agree that "inout" parameters are confusing and often
unnecessary.  Usage of "inout" parameters should be avoided if reasonable.
However, since "inout" exists in CORBA, we should have a convention for
dealing with it.  Actually, the fact that it is rarely used makes my
visual auditing argument below even more poignant.

4. As far as I know, nobody has a silver bullet for tracking down refcount
issues.  I certainly never saw one in the COM world.  The best defense
against refcount bugs is clear, consistent rules so that source code can
be visually audited to see if it follows those rules.

======================
On the subject of refcounting inout parameters:

Miguel writes:
> Proposal:
>
>        inout parameters should be treated just like an in parameter
>        and an out parameter.

Actually, I think that's exactly why convention (2) is so important.  The
convention makes it clear that an "inout" parameter has distinct
properties from either an "in" or an "out"; it is not merely an "in+out".

Miguel writes:
> An alternate convention is required here (saying "varies on the docs"
> is a valid answer).

That's actually exactly what I'm proposing we avoid.  Every Bonobo method
will have to deal with reference counts.  People should not have to
memorize which methods behave in which fashion.  All methods should behave
consistently.

Miguel writes:

> The inout is not part of any spec I can find in my COM books that i
> have here, and I would doubt that they did something as stupid as

It's in /Inside OLE/, and probably elsewhere.  Check here:

http://msdn.microsoft.com/library/default.asp?PP=/library/toc/inole/inole0-2-3.xml&tocPath=inole0-2-3&URL=/library/books/inole/S10F4.HTM

The quote follows: 

> o Functions that accept an interface pointer as an in/out-parameter must 
>   call Release through the in-parameter before overwriting it and must
>   call AddRef through the out-parameter. If the caller wants to maintain
>   a copy of the pointer passed in this parameter, it must call AddRef 
>   through the copy before calling the function.
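
To make the quoted rule concrete, here is a minimal C sketch of a callee
that follows it.  The Obj type, obj_ref(), obj_unref(), swap_object() and
demo_keep_copy() are hypothetical stand-ins invented for illustration;
they are not the Bonobo or COM APIs.  The callee refs the outgoing object
before releasing the incoming one, so that handing back the same object
never drops its count to zero along the way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ref-counted object; a stand-in, not a real Bonobo type. */
typedef struct Obj {
    int refcount;
} Obj;

static void obj_ref(Obj *o)
{
    if (o) o->refcount++;
}

static void obj_unref(Obj *o)
{
    if (o) {
        assert(o->refcount > 0);   /* catch over-release early */
        o->refcount--;
    }
}

/* Callee following the quoted rule: the reference passed in is released
 * before the slot is overwritten, and the object passed out carries a
 * new reference that now belongs to the caller. */
static void swap_object(Obj **inout, Obj *replacement)
{
    obj_ref(replacement);   /* AddRef through the out value */
    obj_unref(*inout);      /* Release through the in value */
    *inout = replacement;
}

/* Caller that wants to keep the old pointer: it refs its copy before the
 * call, as the quote requires.  Returns 1 if the counts come out as
 * expected. */
static int demo_keep_copy(void)
{
    Obj old = { 1 };     /* caller holds one reference */
    Obj fresh = { 1 };   /* callee's own reference to the replacement */
    Obj *arg = &old;
    Obj *copy = arg;

    obj_ref(copy);               /* old.refcount is now 2 */
    swap_object(&arg, &fresh);   /* old drops to 1, fresh rises to 2 */

    return arg == &fresh && copy == &old
        && old.refcount == 1 && fresh.refcount == 2;
}
```

After the call the caller owns one reference through arg and one through
copy, and releases each in the ordinary way.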

Miguel writes: 

> Obviously there *might* be cases in which you take an object, and
> return a different one.  But the first question is: how often is this
> going to be the case?  How many samples of this do we have?

The issue is not "how likely is this?" but "can it ever happen?"  If the
answer to the second question is yes, then every user of a function that
has an inout parameter must keep this in mind when using the function.

> Your example is well suited for the case of "Change_or_replace", but
> I sustain it is an uncommon case, and I sustain that it is a breakage
> from the base for ref/unref usage.

But the point is that convention (2) makes it *invisible* to the caller
whether the inout parameter was replaced or merely passed back unmodified.

Miguel writes:
> Unrefing your first argument in this case is obviously broken[1],
> because then every invocation to a method should become (for simetry): 
> 
>        x_ref (a);
>        x_op (a, &ev);
>
> Every single one of them.  Because semantically it is the same thing
> you propose.

I want to stress again that I think an "inout" parameter should not be
seen as merely an "in+out" parameter.  Consider the following example
(excuse my syntax):


interface x : Bonobo::Unknown {
        void swap_object (inout Bonobo::Unknown to_swap);
        void swap_object2 (in Bonobo::Unknown old, out Bonobo::Unknown new);
        Bonobo::Unknown swap_object3 (in Bonobo::Unknown old);
};

You might expect to call "swap_object" like this:

[a]    x_swap_object( &arg );

But you could not call swap_object2 like this:

[b]    x_swap_object2( arg, &arg );

Similarly, you would not call swap_object3 like this:

[c]    arg = x_swap_object3( arg );

It is clear upon inspection that the above calls [b] and [c] are wrong
because they cause "arg" to be overwritten without the reference being
disposed of.  What is not clear is that the call [a] may be
wrong if "x_swap_object" returns a new object and does not follow
convention (2)!

If "x_swap_object" replaces its inout parameter with a new object and
does not unref() its argument beforehand, then the only correct way to
call it is:

[d]
   tmp = arg;
   x_swap_object( &arg );
   if( arg != tmp ) {
      x_unref( tmp );
   }

Now, who'd think of that every time?  If you're scanning the code
visually for refcount issues and you do not realize that x_swap_object is
an inout function, then [a] won't look wrong to you, although [b] and [c]
certainly will.

Using inout parameters and not following convention (2) *hides refcount 
bugs*.
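
A small C sketch of the same point, under stated assumptions: Obj,
obj_ref(), obj_unref(), swap_following(), swap_not_following() and
naive_caller() are hypothetical names made up for this illustration, not
Bonobo calls.  The caller code is identical in both cases and looks fine
on inspection; only the callee that skips convention (2) leaves a
reference behind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ref-counted object; a stand-in, not a real Bonobo type. */
typedef struct Obj {
    int refcount;
} Obj;

static void obj_ref(Obj *o)
{
    o->refcount++;
}

static void obj_unref(Obj *o)
{
    assert(o->refcount > 0);   /* catch over-release early */
    o->refcount--;
}

/* Callee that follows convention (2): it releases the incoming reference
 * before overwriting the inout slot. */
static void swap_following(Obj **inout, Obj *fresh)
{
    obj_ref(fresh);
    obj_unref(*inout);
    *inout = fresh;
}

/* Callee that does not: the incoming reference is silently dropped. */
static void swap_not_following(Obj **inout, Obj *fresh)
{
    obj_ref(fresh);
    *inout = fresh;
}

/* The naive caller from call [a]: pass the argument, use it, release it.
 * Returns the references still outstanding on the original object:
 * 0 means balanced, 1 means leaked. */
static int naive_caller(void (*swap)(Obj **, Obj *))
{
    Obj original = { 1 };   /* caller holds the only reference */
    Obj fresh = { 0 };
    Obj *arg = &original;

    swap(&arg, &fresh);
    obj_unref(arg);         /* release whatever arg points to now */

    return original.refcount;
}
```

Here naive_caller(swap_following) returns 0, while
naive_caller(swap_not_following) returns 1, and nothing in the caller's
source hints at the difference.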

Now, I'd be willing to agree (as Maciej brings up) that inout parameters
are ugly and should be avoided.  But since they exist they may be used,
and when they are used they should be used in a consistent manner.  If you
follow convention (2), then the caller does not have to special-case the
handling of inout parameters, the callee does.  Since there are more
callers than callees, this makes sense.

Have I convinced you yet, Miguel?

=======================

On the subject of tracking refcounting bugs:

Miguel writes:

> Not really.  You just need to be able to trace ref/unrefs.  And write
> some tools to manipulate data generated from a debugging run.  Sure,
> not the perfect solution, but much better than a semi-broken
> convention.
> 
> Speaking of which, what is the Microsoft standard for this, is there
> any?


I'm trying to convince you that the conventions I propose are not broken.
I think you agree with conventions (1) and (3); it's only convention (2)
that is still up for debate, correct?

I should mention that in my COM experience, I have yet to see a tool that
is effective in tracking down refcount issues in a single process, much
less across networks.  I'm not saying it's impossible, but I am saying it's
very hard to do and we shouldn't count on such a tool showing up any time
soon.

The two most effective techniques for refcount debugging that I've seen
are: (a) Set a breakpoint when the refcount drops to 0 if you think the
object is being released early and (b) Log a stack trace for each ref()
and unref() and dig through them by hand if you see objects being retained
past their useful lifetime.  I haven't seen anything that automates (b).
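
As a rough C sketch of both techniques, assuming a hypothetical Obj type
and helper names (none of this is an existing Bonobo facility):
refcount_hit_zero() exists only as a place to set the breakpoint for (a),
and the OBJ_REF/OBJ_UNREF macros log the call site of every ref and unref
for (b).  This records file and line only, not a full stack trace; a real
stack trace would need a platform-specific facility such as glibc's
backtrace().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical ref-counted object; a stand-in, not a real Bonobo type. */
typedef struct Obj {
    int refcount;
} Obj;

/* Technique (a): set a debugger breakpoint on this function to stop at
 * the exact unref that takes an object to zero. */
static int zero_hits = 0;

static void refcount_hit_zero(Obj *o)
{
    (void)o;
    zero_hits++;
}

/* Technique (b): log every ref and unref together with its call site. */
static void traced_ref(Obj *o, const char *file, int line)
{
    o->refcount++;
    fprintf(stderr, "ref   %p -> %d at %s:%d\n",
            (void *)o, o->refcount, file, line);
}

static void traced_unref(Obj *o, const char *file, int line)
{
    assert(o->refcount > 0);   /* catch over-release early */
    o->refcount--;
    fprintf(stderr, "unref %p -> %d at %s:%d\n",
            (void *)o, o->refcount, file, line);
    if (o->refcount == 0)
        refcount_hit_zero(o);
}

/* Call sites use the macros so that file and line are captured for them. */
#define OBJ_REF(o)   traced_ref((o), __FILE__, __LINE__)
#define OBJ_UNREF(o) traced_unref((o), __FILE__, __LINE__)
```

The log still has to be paired up by hand, object by object, which is
the part nobody seems to have automated.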

================

Maciej says:

> I hope I won't embarass the original poster (Michael Fleming) too much
> by pointing out that due to previous work experience he is in a
> position to know better than either you or the docs what exactly
> Microsoft's conventions for COM refcounting are.

<smile> As long as you don't try to make me admit it in front of RMS, as
Jurgen did last week!

Mike Fleming
(also mfleming@eazel.com)