GLib new user + Quarks and Hash Tables



  I am currently working on a research compiler and simulator for computer
architecture (i.e. design your own novel hardware feature, compile for it,
simulate, measure benefit).  I have very recently become interested in GLib
and have thought of replacing parts my current home-grown library with it if
it makes sense to do so.  My first task was the replacement of our symbol
table.

The hash table/symbol table I currently use maps a string (or any type) to a
pointer (string->pointer) just like GLib's hash table.  However, there are
forms of the add and find functions which return a unique identifier (the
symbol) much like your quarks.  This unique identifier is used in other
higher level structures to quickly return either the string or the pointer.
It is also used for equality/inequality comparisons with other symbols.
This unique identifier (or symbol) is currently implemented as a pointer to
the hash table element, but this is hidden from the user with functions
implemented very simply as:  (pseudo code here)

get_pointer_from_symbol(symbol) {symbol->pointer}

and

get_string_from_symbol(symbol) {symbol->string}

Alternatively, the unique identifier could be an int (or quark) and these
function could return the results of an array lookup, etc.

As I tried to port my compiler's symbol table to GLib I ran into a small
concern.  Basically, I need to use quark to map a string to a unique int
(which I can use as a symbol and compare in the higher level structures) and
a hash table to map the unique int to the pointer.

The inefficiency with this in GLib is that it utilizes two hash tables.  In
addition to space concerns, two lookups are necessary to go from a string to
pointer.  My question is:

Is there a better way to do this in GLib?

If not:
Can we augment the hash table to return a unique symbol (polymorphic
quark-like item) for some versions of add, find etc?

These unique symbols are essentially polymorphic quarks which might be a
nice thing to offer anyway.   This change would also make the implementation
of string quarks simpler, faster, and more space efficient.   (Quarks would
basically be a hash table of strings->NULL.  The quark would be the
symbol/pointer to the hash table element.  The string would be accessible by
symbol by a single dereference.  The pointer to NULL would basically be
ignored.  No array necessary.)

Thank you in advance for your help.

David

--
Prof. David August                        Department of Computer Science
august@cs.princeton.edu                   Princeton University
http://www.cs.princeton.edu/~august       (609) 258-2085





[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]