# Proposal for a new ghash table

• From: "Emmanuel DELOGET" <logout free fr>
• To: <gtk-devel-list redhat com>
• Subject: Proposal for a new ghash table
• Date: Fri, 26 Nov 1999 12:54:12 +0100

 Hi everibody.   I've just looked at the code of the current ghash (i was wondering what was the underlying code for the hash table resizing). Here is another approach for a dynamic hash table, based upon an article or Per-Ake (Paul) Larson (can't find the reference of this article). I will code it in the next days, and will post results here.   The Larson's algorithm does not introduce any freeze/thaw statements. It increases the size of the hash table in order to limit the size of the hash node list. Basically, it starts with a 2^n size. When the number of hash node is bigger than its current size, the size of the hash table grows to 2^(n+1). This happens on the insertion of a new hash node. then it computes the new hash key and insert the node at the right place.   The key problem is the hash table lookup. Since the size of the table grows and the since the nodes are not moved in the table (well... you can move the nodes too, see later), it is possible to not find the node on the very first look up. Consider a table with a base size of 4 that has gown to have have a size of 32. The hash val of the first element you have added was 123456. In a table of 4 nodes, the resulting hash val is 0. In a table of 32 nodes, the resulting hash val is 8. In order to find the correct key for a particular node, you'll have to     1) compute the current hash value V from the         key K (store it as V1)     2) if the node is found, return it.     2) if not, find the greater n value defined by        2^n < V     4) compute the new K value     5) go to step 2 Before returning, a good thing should be to move the node to position V1 if V != V1. The next time you'll want to access to the node, the look up will be faster.   Of course, an extra thaw function can be provided to reorganize the entire hash table for faster look up when all the elements have been added to the table...   The table is not collapsed when the number of node decreases.   [the following assumes that the base size is 1 - this is only assumed to show the algorithmes]   In order to minimize the allocations, the table is not entirely reallocated when its size changes. you simply use an array of 32 tables (looks like)   g_hash_node **hash_tables;   The insertion code looks like (pseudo-code):   node = create node(key, value); if (nbnodes > current size) {   current_array++;   hash_tables[current_array] =     allocate new table(current_size);   current_size *= 2; } hash_val = compute_val(node, key) % current_size; cur_size = current_size / 2; cur_array = current_array; while (cur_size > hash_val) {   cur_array--;   cur_size /= 2; } add_node_to_array(node,                             hash_tables[cur_array],                             hash_val - cur_size);   The look up code looks like :   node = NULL; cur_loop = 0; cur_size = curent_size; V = V1 = compute_val(node, key) % current_size; while (cur_loop <= current_array) {   if (node is found at pos V) break;   n = (int)log2(V);   cur_size = 2^n;   V = compute_val(node, key) % cur_size;   cur_loop++; } if (node) {   if (V1 != V) move_node_from_to(node, V, V1); } return (node);   To be found at position V in the whole table, a node should be at pos (V - cur_size) in table hash_tables[log2(cur_size)].   Hope this is of interest.   Yours,   Emmanuel