On Wed, Aug 06, 2008 at 04:11:13AM -0400, Daniel Veillard wrote:
> On Wed, Aug 06, 2008 at 08:25:14AM +0200, Stefan Behnel wrote:
>
> Another option I looked at is the 'One-at-a-Time Hash' from
> http://burtleburtle.net/bob/hash/doobs.html . Looking at the criteria
> and the results, it looks like a good hash too: not too expensive, and
> it should work well. I will try to make a patch using it this morning.
> If you have a bit of time then, maybe you can rerun your initial tests
> with that one, is that possible?
I did just that ... I committed in revision 3764 a new version using the
'One-at-a-Time Hash'; it passes all my tests. There is a #define WITH_BIG_KEY;
removing it allows testing the old version without much burden. Hopefully this
fixes both the performance issue you had and the problems with the dictionary
semantics which were in SVN, while still keeping the fast hash for small
dictionaries.

The nasty thing is that errors in the hash only show up in very weird and
convoluted manners: all the simple things seem to work, and you have to dig
around and make guesses to find out what is really happening. I will add
specific regression tests for the dictionary to make sure we always keep the
dictionary guarantees, even if the size grows or if we use subdicts.

So I'm not fully done with this issue, but there is something out for testing!

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
Attachment: new_big_hash.patch (text document)