High-performance Priority Queue / Heap for glib

From: Maik Zumstrull <maik zumstrull rz uni-karlsruhe de>
To: gtk-devel-list gnome org
Subject: High-performance Priority Queue / Heap for glib
Date: Tue, 3 Mar 2009 16:45:10 +0100
Needing a fast priority queue for an internal project recently and
noticing that glib doesn't have one, I wrote it myself. I'd be
interested in submitting the code, and I'd be willing to do the
necessary cleanup and documentation work on it.

It's a pretty straight-forward implementation of a Fibonacci Heap
<http://en.wikipedia.org/wiki/Fibonacci_heap>, with a forest of trees
of circular double-linked lists as the underlying data structure (which
is not as nasty as that sounds).

According to my quick mailing list search, a Heap-based priority queue
was submitted in 2003, but not accepted. Let me address the comments in
<http://mail.gnome.org/archives/gtk-devel-list/2003-March/msg00006.html>.

| - Does it offer significant (big-o) performance 
|   benefits for a common operation, or allow some 
|   operation that would be really hard to code without 
|   the data structure?

All operations can be done with existing glib data structures, but
there is a significant performance benefit. Please refer to
<http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times>.

Glib includes linked lists and balanced trees, of course; a simple heap
was submitted in 2003. As you can see, all three structures are easily
beaten by this approach, dropping performance to O(1) for most
operations and retaining O(log n) performance in the other cases.

| - Would it be useful to a significant fraction of
|   applications?

I don't think I can estimate the need. Personally, I was working with
graph algorithms related to partitioning and path finding, where a good
priority queue has a significant impact. In other areas where a
priority queue is needed, presumably people are simply using glib's
trees or lists now, accepting the performance penalty, which wouldn't
matter in most applications.


My implementation, ~300 lines of code, is fairly self-contained. It
introduces one new data structure (to be renamed for inclusion):

struct s_FQueue {
        gpointer data;
        gint priority;

        gint degree;
        gboolean marked;

        struct s_FQueue *parent;
        struct s_FQueue *prev;
        struct s_FQueue *next;
        struct s_FQueue *firstchild;
};

typedef struct s_FQueue FQueue;

The glib slice allocator is used to create and drop these elements.
There are no significant dependencies on glib functionality otherwise.

Memory management follows the semantics used by glib's linked lists: a
simple NULL pointer represents an empty queue, and FQueue elements are
created and deleted implicitly as items are added to and removed from
the queue. The implementation doesn't at all care about the data
pointer, so filling it with GINT_TO_POINTER wrapped gints instead of
actual pointers is okay.

The element's priorities have to be expressed as gints, the use of a
comparison function is not supported as an alternative. Elements are
returned by the queue in ascending order of the priority value. No
guarantees are made regarding the iteration order for elements with the
same priority. The priority of an entry can be changed after insertion
by a function call, but the priority field must not be directly written.

The user is expected to keep one FQueue *q variable for the queue, then
pass its address to most of the API calls. The pointer is rewritten by
each call as necessary to always point to the FQueue element currently
at the front of the queue (or NULL for an empty queue). The GList style
syntax q = some_api_call(q, ...); for rewriting the pointer is not used
because of the special needs of the insert call (see below). This could
be inverted to use GList-style syntax everywhere, and an FQueue** in
the insert call only.

This is the proposed / currently implemented API:

#define fqueue_isempty(fq) (gboolean)(fq == NULL)

Obvious.

FQueue* fqueue_insert (FQueue **fq, gpointer data, gint priority);

Inserts "data" into the list with a priority value of "priority". An
FQueue element is implicitly allocated. The fq pointer is rewritten to
point to the front of the queue. The *returned* pointer points to the
newly allocated element. Do NOT put that into your queue variable, but
keep the pointer around somewhere if you intend to make
change_priority or delete calls on this element. The returned pointer
can be ignored if you don't plan to change any priorities and don't
plan to remove elements other than by pop()-ing them.

gpointer fqueue_peek (FQueue *fq);

Returns the value of the data pointer of the element with the lowest
priority value, without removing it from the queue. NULL is returned
for an empty queue, so if you need to differentiate between an empty
queue and an element with a NULL data pointer, check for that first.

gpointer fqueue_pop (FQueue **fq);

Returns the value of the data pointer of the element with the lowest
priority value, removing it from the queue. The FQueue element is
implicitly deallocated. Do not attempt to make change_priority or delete
calls on that element after this, it's gone! NULL is returned for an
empty queue.

void fqueue_delete (FQueue **fq, FQueue *elem);

Deletes the queue element pointed to by elem from the queue, rewriting
the main queue pointer if necessary.

void fqueue_change_priority (FQueue **fq, FQueue *elem, gint newpriority);

Changes the priority of the queue element pointed to by elem to
newpriority, rewriting the main queue pointer if necessary.

void fqueue_merge (FQueue **fqa, FQueue **fqb);

Merges two priority queues into one. fqb will be NULL after this call
and fqa contains one queue with all elements of both queues. No memory
needs to be allocated or deallocated, neither explicitly nor implicitly.

void fqueue_free (FQueue *fq);

Rarely needed. If you fill up a queue and then pop() all the elements,
all the memory will be deallocated already. If, for some reason, you
want to get rid of a queue before it's empty, this call is for you. Of
course, calling this on an already empty queue doesn't hurt, it's just
unnecessary.


Any comments? If you're at all interested, what exactly would be
expected of the code for inclusion?
Follow-Ups:
- Re: High-performance Priority Queue / Heap for glib
  - From: Ben Pfaff
- Re: High-performance Priority Queue / Heap for glib
  - From: Behdad Esfahbod
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]