[xml] libxml2, python, reducing memory consumption?

I have a tool that reads a number of input documents, does some processing, and then builds and writes a new document. It uses about 500MB of RAM when it runs, so I am trying to find ways to reduce the memory footprint. I know that using SAX instead of DOM would do it for me, but the nature of what needs to be done just screams for DOM and XPath, so that is the way I have done it.

The total size of the input documents is about 100MB, and each one is about 4-8MB. The total size of the output document is about 1.5MB.

I added a call to doc.freeDoc() once I am done with each input document, before loading the next one, but that doesn't seem to affect memory consumption at all.
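
To make the pattern concrete, here is a minimal sketch of the per-document loop described above. It is an illustration rather than the actual tool: the function name, the paths and the XPath expression are hypothetical, and the import is placed inside the function only so the snippet loads where the bindings are not installed. The libxml2 calls used are parseFile, xpathNewContext, xpathEval, xpathFreeContext and freeDoc.

```python
def extract_values(paths, xpath_expr):
    """Collect the text of every node matching xpath_expr, one input file at a time."""
    import libxml2  # Python bindings for libxml2 (assumed to be installed)

    values = []
    for path in paths:
        doc = libxml2.parseFile(path)
        ctxt = doc.xpathNewContext()
        try:
            for node in ctxt.xpathEval(xpath_expr):
                # Copy the content out as a plain Python string so that
                # nothing kept afterwards refers back into the parsed tree.
                values.append(node.content)
        finally:
            # The bindings do not free these automatically; release the
            # XPath context first, then the document it was created from.
            ctxt.xpathFreeContext()
            doc.freeDoc()
    return values
```

Only one input tree should be alive at any time with this structure, so the expectation is that memory stays near the cost of the largest single document.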

Then, instead of building the whole output tree in memory, I tried building only the dozen or so children of the root element one at a time, serializing each to a file and then calling node.freeNode(), but there was still no improvement.
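
The second attempt can be sketched roughly as follows. Again this is an illustration under assumptions, not the real code: the function name, the element names and the shape of child_specs are made up, and the root element's tags are written by hand because the root itself is never built. The libxml2 calls used are newDoc, newDocNode, newChild, serialize, freeNode and freeDoc.

```python
def write_children(out_path, root_name, child_specs):
    """Write <root_name> with one child subtree per (name, texts) pair,
    building and freeing each subtree individually."""
    import libxml2  # Python bindings for libxml2 (assumed to be installed)

    doc = libxml2.newDoc("1.0")  # owner document for the temporary nodes
    try:
        with open(out_path, "w") as out:
            out.write("<%s>\n" % root_name)
            for name, texts in child_specs:
                # Create an unlinked element that belongs to doc.
                child = doc.newDocNode(None, name, None)
                for text in texts:
                    child.newChild(None, "item", text)
                out.write(child.serialize())
                out.write("\n")
                # The node was never linked into a tree, so it has to be
                # freed explicitly; this also frees its children.
                child.freeNode()
            out.write("</%s>\n" % root_name)
    finally:
        doc.freeDoc()
```

With this layout only one child subtree of the output exists in memory at a time.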

What am I missing?

