Re: [xml-bindings]HTML parser segfaults
- From: Daniel Veillard <veillard redhat com>
- To: Gary Benson <gary inauspicious org>
- Cc: xml-bindings gnome org
- Subject: Re: [xml-bindings]HTML parser segfaults
- Date: Mon, 13 May 2002 06:31:05 -0400
On Sat, May 11, 2002 at 02:08:32PM +0100, Gary Benson wrote:
>
> On Fri, 10 May 2002, Gary Benson wrote:
>
> > I've been getting segfaults when trying to write a SAX parser for HTML. It
> > looks like libxml2.htmlCreatePushParser works correctly but the first time
> > you call libxml2.htmlParseChunk it will segv because, in C land,
> > ctxt->input (and possibly ctxt) has been trashed somewhere. I got a little
> > lost trying to debug it as python seems to be doing some weird threading
> > stuff (and I'm very tired now because of it :-/), so I thought I'd post it
> > here and see if anyone else can find it whilst I sleep ;)
>
> Hmmm, so I gave up on gdb and resorted to my old family favourite printf
yup debugging in the stubs can become a bit ... messy.
> debugging and eventually found it: libxml2.htmlParseChunk was passing the
> python parserCtxt object to libxml2mod.htmlParseChunk rather than the C
> parserCtxt object. The attached patch to libxml2.py fixes the problem
> (though the generator needs fixing really) and the attached script, a
> modification of pushSAX.py, exercises the problem.
>
> Cheers,
> Gary
>
> [ gary inauspicious org ][ GnuPG 85A8F78B ][ http://inauspicious.org/ ]
> --- python/libxml2.py~ Sat May 11 13:20:10 2002
> +++ python/libxml2.py Sat May 11 13:39:53 2002
> @@ -335,7 +335,7 @@
>
>  def htmlParseChunk(ctxt, chunk, size, terminate):
>      """Parse a Chunk of memory"""
> -    ret = libxml2mod.htmlParseChunk(ctxt, chunk, size, terminate)
> +    ret = libxml2mod.htmlParseChunk(ctxt._o, chunk, size, terminate)
>      return ret
Hum, that code is generated ... this seems related to a problem in the
stub generator code, which doesn't treat htmlParserCtxtPtr arguments as
xmlParserCtxtPtr ones. I fixed generator.py and now libxml2.py contains the
proper libxml2mod.htmlParseChunk() call. It actually becomes a method of
the parserCtxt class in that case. I tried to fix the example but now I get
paphio:~/XML/python/tests -> ./pushSAXhtml.py
Error got: startDocument:startElement html None:startElement body None:startElement foo {'url': 'tst'}:error: Tag foo invalid
:characters: bar:endElement foo:endElement body:endElement html:endDocument:
Exprected: startDocument:startElement foo {'url': 'tst'}:characters: bar:endElement foo:endDocument:
paphio:~/XML/python/tests ->
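For illustration, the wrapper-versus-raw-object mistake discussed above can be sketched in plain Python without libxml2 installed. Everything here is a hypothetical stand-in: FakeCModule plays the role of the C extension (libxml2mod), ParserCtxt the role of the Python wrapper class, and only the _o attribute name is taken from the patch. A real C stub would probably crash rather than raise, so the TypeError is just there to make the mistake visible.

```python
# Hypothetical stand-ins, not the real libxml2 bindings.

class _CHandle:
    """Stands in for the opaque C-level parser context."""
    def __init__(self):
        self.chunks = []


class FakeCModule:
    """Mimics a C extension that only accepts raw handles."""

    @staticmethod
    def create_push_parser():
        return _CHandle()

    @staticmethod
    def parse_chunk(handle, chunk, size, terminate):
        # A real C stub would use the pointer blindly; here we raise
        # instead so that the wrong argument type is easy to see.
        if not isinstance(handle, _CHandle):
            raise TypeError(
                "expected raw C handle, got %s" % type(handle).__name__)
        handle.chunks.append(chunk[:size])
        return 0


class ParserCtxt:
    """Python-side wrapper; the raw handle lives in the _o attribute."""

    def __init__(self, _obj):
        self._o = _obj

    def parse_chunk(self, chunk, size, terminate):
        # Method form: unwrap self._o before crossing into "C".
        return FakeCModule.parse_chunk(self._o, chunk, size, terminate)


def buggy_parse_chunk(ctxt, chunk, size, terminate):
    # Passes the Python wrapper itself (the reported bug).
    return FakeCModule.parse_chunk(ctxt, chunk, size, terminate)


def fixed_parse_chunk(ctxt, chunk, size, terminate):
    # Passes the wrapped handle, as the patch does with ctxt._o.
    return FakeCModule.parse_chunk(ctxt._o, chunk, size, terminate)


ctxt = ParserCtxt(FakeCModule.create_push_parser())
data = "<foo url='tst'>bar</foo>"
fixed_parse_chunk(ctxt, data, len(data), 0)   # accepted
ctxt.parse_chunk("", 0, 1)                    # method form, also accepted
try:
    buggy_parse_chunk(ctxt, data, len(data), 1)
except TypeError as exc:
    print("rejected:", exc)
```

Moving the call into a method of the wrapper class keeps the unwrapping in one place, so callers never handle _o themselves.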
I will commit anyway and fix this later,
thanks for pointing this out!
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/