Re: [xml] Better hash function for dict.c



On Wed, Aug 06, 2008 at 04:11:13AM -0400, Daniel Veillard wrote:
On Wed, Aug 06, 2008 at 08:25:14AM +0200, Stefan Behnel wrote:
  Another option I looked at is the 'One-at-a-Time Hash' from
http://burtleburtle.net/bob/hash/doobs.html , looking at the criterias
and the results it looks like a good hash too, not too expensive and 
should work well. I will try to make a patch using this this morning,
if you have a bit of time then, maybe you can rerun your initial tests
with that one, is that possible ?

  I did just that ... I commited in revision 3764 a new version
using 'One-at-a-Time Hash' this passes all my tests. There is a
#define WITH_BIG_KEY

  removing it allows to test the old version without much burden
Hopefully this fixes both the performance issue you had and the problems
of dictionary semantic which was in SVN, and still keeping the fast
hash for small dictionaries.
The nasty thing is that errors in the hash only show up in very weird
and convoluted manners, all the simple things seems to work and you have
to dig out and make guesses to find out what is really happening.

I will add specific regression tests for the dictionary to make sure
we always keep the dictionary garantees even if the size grows or if
we use subdicts. So i'm not fully done with this issue but there is
something out for testing !

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/

Attachment: new_big_hash.patch
Description: Text document



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]