Re: High-performance Priority Queue / Heap for glib



Ben Pfaff wrote:

> Maik Zumstrull <maik zumstrull rz uni-karlsruhe de> writes:

> The big-O of a Fibonacci heap is better than that of a balanced
> tree or a binary heap, but how is the actual performance?  When
> you benchmark your Fibonacci heap against balanced trees or
> binary heaps for realistic test cases, is there an actual
> improvement in runtime?

To be perfectly honest, I hadn't run any direct benchmarks before and
just assumed a Fibonacci Heap would be faster. I have done so now.

First, the results of my synthetic benchmarks:

Running Test "Insert all, then pop all":
         n      TQueue     FQueue
      5000      0.006s     0.004s
     50000      0.137s     0.070s
    500000      2.202s     1.148s
Running Test "Insert, then pop, randomized":
         n      TQueue     FQueue
      5000      0.006s     0.003s
     50000      0.102s     0.048s
    500000      1.649s     0.803s
Running Test "Insert, change, pop, randomized":
         n      TQueue     FQueue
      5000      0.011s     0.004s
     50000      0.147s     0.055s
    500000      2.455s     0.874s

"FQueue" (FibonacciQueue) is my implementation, "TQueue" (TreeQueue) is
a thin abstraction to provide a priority queue API on top of GSequence.
I looked at GTree first, but it provides no clean way to get the
minimal element of the tree as far as I can see. You'd have to call
g_tree_foreach and abort in the first callback. GSequence seems more
appropriate.

All tests involve inserting n elements with random priorities into
the structure. In the first test, there are n inserts followed by n
pops. In the second test, a random number of inserts is followed by a
random number of pops until n inserts have happened in total, then the
queue is poped until empty. The third test is similar, but between
inserting and poping a random number of elements have their priority
randomly changed in each cycle. Each test is run 10 times, the
numbers listed are averages over that.

Being faster by a factor of about 2-4 is not stellar, but not too
shabby either. Two other things to note: FQueue is much faster for the
randomized tests; this is probably because until you pop, FQueue is
simply a linked list. The structural compression into trees happens
during poping, and would take quite a while if the root list is 500000
elements long at that time. Secondly, FQueue barely cares about
change_priority calls, while TQueue almost doubles its runtime. I guess
that's the difference between O(1) and O(log n) right there.

For a real-world perspective, I forked the graph processing application
I mentioned and switched from FQueue to TQueue. This increases the
runtime for real input data by about 30% -- FQueue is somewhat faster,
though not as much as I expected it to be.

> The reason that I ask is this paragraph in the chapter on
> Fibonacci heaps in Cormen, Leiserson, and Rivest, _Introduction
> to Algorithms_:
> 
>         From a practical point of view, however, the constant
>         factors and programming complexity of Fibonacci heaps
>         make them less desirable than ordinary binary (or k-ary)
>         heaps for most applications.  Thus, Fibonacci heaps are
>         predominantly of theoretical interest.

I'm not sure what "programming complexity" he's talking about. The
forest data structure involves some mildly annoying pointer-fiddling,
but it's not exceptionally challenging. Other than that, he might be
right about the relation between binary- and Fibonacci-heaps. I haven't
tested against the former, my primary goal was to be faster than lists
and trees, and I went with what I thought to be the best algorithm for
the job. Maybe I've let the theory deceive me.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]