Re: [xml] ¿leak using libxml2 for sax-parsing html in python?

On Tue, Jan 09, 2007 at 11:40:40AM +0100, Cesar Ortiz wrote:
I've been looking at the python bindings and I think I have seen
Well, I have to understand exactly how it works... as I don't know what
every file is for.

The thing is that I guess that the function that needs to be called to free
the context is htmlFreeParserCtxt.

  That's not a problem; the structures are basically the same:

/**
 * htmlFreeParserCtxt:
 * @ctxt:  an HTML parser context
 *
 * Free all the memory used by a parser context. However the parsed
 * document in ctxt->myDoc is not freed.
 */
void
htmlFreeParserCtxt(htmlParserCtxtPtr ctxt)

But in the context returned by htmlCreatePushParser (class parserCtxt) in
the __del__ method the function that gets called is xmlFreeParserCtxt.

  which is okay

So, it is not true that the developer has to free the resources

  I said that the unit of allocation is the *document*. If you get a document
you need to free it. Otherwise cleanup should be automatic in Python. And with
SAX you never get a document.
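  To illustrate the automatic cleanup, here is a minimal pure-Python sketch of
a wrapper whose __del__ releases the underlying resource. FakeParserCtxt and
free_parser_ctxt are hypothetical stand-ins made up for this example and are
not part of the libxml2 bindings. The immediate release on assigning None
assumes CPython's reference counting.

```python
# Hypothetical stand-ins: FakeParserCtxt and free_parser_ctxt are names made
# up for illustration only; they are not part of the libxml2 bindings.
freed = []


def free_parser_ctxt(handle):
    """Pretend to release the C-level parser context."""
    freed.append(handle)


class FakeParserCtxt(object):
    def __init__(self, handle):
        self._o = handle

    def __del__(self):
        # Release the wrapped object when the last Python reference goes away.
        if self._o is not None:
            free_parser_ctxt(self._o)
        self._o = None


ctxt = FakeParserCtxt("ctxt-1")
ctxt = None  # under CPython the last reference is dropped here, so __del__ runs
print(freed)
```

No explicit free call appears in the user code: dropping the reference is
enough, which is the behaviour described for the context wrapper above.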

 (no python
style), because when you assign None to the context a 'freeresources'
function is called.
Furthermore, it looks like the wrong function is called.

  To me, that does not leak memory allocated by libxml2.
Maybe there is a leak, maybe not. Your test case relies on all the .html
and .htm present in your directory somewhere and possibly other things like
your version of python, of the bindings, and of libxml2.
To get back to something debuggable that I can work on, you need to follow
what I said, i.e. get back to the simple case showing a leak when the execution
stops at the end of the script and its output:

if libxml2.debugMemory(1) == 0:
    print("OK")
else:
    print("Memory leak %d bytes" % (libxml2.debugMemory(1)))
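
A self-contained test case along those lines might look like the sketch below.
It feeds a fixed inline snippet to the HTML push parser instead of whatever
.html files are on disk. The Handler class, the sample markup and the
run_leak_check helper are made up for illustration, and the import guard is
only there so the script still runs where the bindings are not installed. It
assumes bindings that expose debugMemory, htmlCreatePushParser, htmlParseChunk,
cleanupParser and dumpMemory.

```python
try:
    import libxml2
except ImportError:  # bindings not installed; nothing to measure
    libxml2 = None


class Handler(object):
    """Hypothetical SAX handler that just counts events."""

    def __init__(self):
        self.events = 0

    def startElement(self, tag, attrs):
        self.events += 1

    def endElement(self, tag):
        self.events += 1

    def characters(self, data):
        self.events += 1


def run_leak_check():
    """Return the leaked byte count reported by libxml2, or None without bindings."""
    if libxml2 is None:
        return None
    libxml2.debugMemory(1)          # switch on allocation tracking first
    handler = Handler()
    # Fixed inline input, so the result does not depend on local files.
    ctxt = libxml2.htmlCreatePushParser(handler, "<p>he", 5, "test.html")
    chunk = "llo</p>"
    ctxt.htmlParseChunk(chunk, len(chunk), 0)
    ctxt.htmlParseChunk("", 0, 1)   # terminate the parse
    ctxt = None                     # drop the context; no document to free
    libxml2.cleanupParser()
    return libxml2.debugMemory(1)


leaked = run_leak_check()
if leaked is None:
    print("libxml2 bindings not available")
elif leaked == 0:
    print("OK")
else:
    print("Memory leak %d bytes" % leaked)
    libxml2.dumpMemory()
```

If this prints a non-zero leak, the script together with its output and the
versions of python and libxml2 in use is something that can be debugged.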

You can chase that bug your own way too, but sorry, then I can't help, except
by reviewing a patch if you can suggest one in the end.


Red Hat Virtualization group
Daniel Veillard      | virtualization library
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
