This only fixes a very minor problem, but I'd thought I'd point it out anyway: The problem is that the hash table in hash.c grows when the length of a chain in the table is over MAX_HASH_LEN. But growing the table does *not* necessarily mean an individual chain in the table will get smaller, because the elements in that chain may still have the same hash value. For example, if we parse the following (very contrived ;) ) document, the hash table will grow to a size of 16384, even though it only has 12 elements in it, and it will continue growing exponentially if we add any more elements to the chain that is already too long. <?xml version="1.0"?> <!DOCTYPE root [ <!ENTITY abcde "abcde"> <!ENTITY abced "abced"> <!ENTITY abdec "abdec"> <!ENTITY abdce "abdce"> <!ENTITY abedc "abedc"> <!ENTITY abecd "abecd"> <!ENTITY acbde "acbde"> <!ENTITY acbed "acbed"> <!ENTITY adbec "adbec"> <!ENTITY adbce "adbce"> <!ENTITY aebdc "aebdc"> <!ENTITY aebcd "aebcd"> ]> <root/> The attached patch should provide more sensible behaviour in this pathological case -- we only grow the hash table if the *average* chain length is greater then MAX_HASH_LEN (perhaps MAX_HASH_LEN should also be reduced if this patch is applied?) Thanks, Anthony
Attachment:
hash.c.patch
Description: Text document