Re: glib hash memory footprint
- From: Davi de Castro Reis <davicastro terra com br>
- To: Keith Packard <keithp keithp com>
- Cc: gtk-devel-list gnome org, Sander Vesik <sander_traveling yahoo co uk>, "Alan M. Evans" <ame1 extratech com>
- Subject: Re: glib hash memory footprint
- Date: Tue, 31 Aug 2004 23:49:10 -0300
Hi all,
First of all, thank you for all the feedback. I think most of you can go
on for a long time about the advantages and disadvantages of different
hash implementations. But probably everybody agrees that there is not a
single approach that is good for every case.
Let me discuss some of the options and give my opinion on how we could
get this done. There are several options:
1. Current linked list implementation: this is simple, readable and
gives good performance, but has a huge memory penalty
2. Dynamic arrays: probably better than linked list, but I don't think
it is worth it. First, you still have a memory penalty for the size of
the arrays of for a array terminator. Furthermore, allocating memory
blocks at will might result in a very fragmented and slow to free memory.
3. Linear probing: this is what I want. It is simpler than double
hashing and could be easily changed for quadric probing. Memory overhead
is very small, and if you can tune your hash function and expose the
g_hash_resize() call to "close" the hash, you can get very good results.
By "close" I mean doing mostly (exclusively) lookup operations.
4. Double hashing: another good option that someone might want to give a
try.
Since there are too many options and too many tradeoffs/scenarios, my
suggestion would be to go with a polymorphic implementation using a
union. So you create your hash using g_hash_table_new (linked list),
g_hash_table_linear_probing_new(), g_hash_double_hashing_new(),
g_hash_array_new() and so on. Then you simply feed the returned pointers
to the common methods (ex: g_hash_lookup()) and let the first byte of
the union demultiplex the correct call.
What I want to implement is the linear probing approach, but I can leave
the hooks for those that want others approches. The advantage is that
running code will not be affected at all. Do you like this idea?
Thanks,
Davi de Castro Reis
ps. I think that the problem of reserving UNUSED/UNITIALIZED values for
the linear probing approach, letting NULL->NULL entries can easily be
solved using and if for those reserved values.
[Date Prev][
Date Next] [Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]