Re: g_str_hash
- From: Tim Janik <timj gtk org>
- To: gtk-devel-list redhat com
- cc: kenelson sequoia ece ucdavis edu
- Subject: Re: g_str_hash
- Date: Sat, 12 Feb 2000 01:28:00 +0100 (CET)
On Fri, 11 Feb 2000, Karl Nelson wrote:
>
> > > This is the exact same hash except operations have been moved to
> > > enhance speed. The difference being the new one keeps the upper bits
> > > and thus has less collisions and operates somewhere between the TCL and
> > > GLib on.
> >
> > I suspect the appeal of the ASU one is solely that it is nice and
> > easily analyzable. I won't believe that it creates less collisions
> > until somebody shows me a set of (plausible) strings where that is the case.
>
> Well, I have some formulations which should have much better collision
> properties. (This took several hours, so please give it more than
> a token glance.)
>
> This hash is based on multiplying by 31. (and is approx the same speed
> as TCL.)
>
> guint hash (gconstpointer v)
> {
> const char *p;
> guint h=0;
> for(p = (const char*) v; *p != '\0'; p += 1) {
> h = ( h << 5 ) - h + *p;
> }
> return h;
> }
> Short word test: (3-16 char)
> ----------------------------
> (with 24474 words, most short)
> With Glib current: 86 hits
> With TCL: 250 hits
> With X31: 1 hits
> Long test: (20-40 char, vary in tail)
> ----------------------------------------
> Using the 2040 longest words in dict and then pairing them
> with every other word we get 4330561 phrases. (we have a lot
> more phases so the hit ratio is expected to be a lot higher.)
>
> Glib: 220793
> TCL: 132501
> X31: 131478
ok, that's good enough a proof, and thanks for actually bothering
to cook up a test case (something everyone else has fallen short on ;)
the only think left to note is that you can spare the first
hash step, since 0 needs no propagation. so we get:
guint
g_str_hash (gconstpointer key)
{
const gchar *p = key;
guint h = *p;
for (p = key + 1; *p != '\0'; p += 1)
h = (h << 5) - h + *p;
return h;
}
any further objections?
---
ciaoTJ
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]