[xml] SAX HTML still stuck.
- From: Bill Moseley <moseley hank org>
- To: xml gnome org
- Subject: [xml] SAX HTML still stuck.
- Date: Thu, 04 Oct 2001 10:50:38 -0700
Sorry for the boring and non-sexy questions, but I need help:
The parser is hanging when I try to abort processing.
I'd like to abort SAX parsing mid-document (in this case I'm aborting
right after </title>). My endElement handler sets an abort flag in my
user data; the read loop checks that flag before reading the next chunk
and bails out.
Here are the exciting details:
I'm using the Apache core.html documentation page for testing.
(Note: if I grab a copy from apache.org it doesn't hang,
so something in my copy of the doc seems to confuse the parser.)
lwp-download http://hank.org/modules/core.html
Saving to 'core.html'...
119 KB received
gdb ./testlibxml2
GNU gdb 4.18
(gdb) run core.html
Starting program: /data/_g/lii/swish-e/src/./testlibxml2 core.html
*hangs*
Program received signal SIGINT, Interrupt.
0x4006beed in htmlParseTryOrFinish (ctxt=0x804ac60, terminate=1) at HTMLparser.c:4317
4317 if ((avail == 1) && (terminate)) {
(gdb) bt
#0 0x4006beed in htmlParseTryOrFinish (ctxt=0x804ac60, terminate=1) at HTMLparser.c:4317
#1 0x4006c6d3 in htmlParseChunk (ctxt=0x804ac60,
chunk=0xbfffe7fc "CTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">\n<HTML>\n<HEAD>\n<TITLE>Apache
Core Features</TI
terminate=1) at HTMLparser.c:4620
#2 0x80487f0 in main (argc=2, argv=0xbffff8b4) at testxmllib2.c:35
(gdb) q
cat testxmllib2.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/HTMLparser.h>
static void end_hndl(int *abort, const char *el);
int main(int argc, char **argv) {
htmlSAXHandler SAXHandlerStruct;
htmlSAXHandlerPtr SAXHandler = &SAXHandlerStruct;
int abort = 0;
char buf[4096];
htmlParserCtxtPtr ctxt;
int res;
FILE *f;
memset( SAXHandler, 0, sizeof( htmlSAXHandler ) );
SAXHandler->endElement = (endElementSAXFunc)&end_hndl;
if ( !(f = fopen( argv[1], "r")))
{
printf("Failed to open '%s'\n", argv[1]);
return -1;
}
if ( !(res = fread(buf, 1, 4, f)))
return -1;
ctxt = htmlCreatePushParserCtxt(
SAXHandler, &abort, buf, res, argv[1], 0);
while ( !abort && (res = fread(buf, 1, 2048, f)) > 0)
htmlParseChunk(ctxt, buf, res, 0);
htmlParseChunk(ctxt, buf, 0, 1);
htmlFreeParserCtxt(ctxt);
printf("done!\n");
return 0;
}
static void end_hndl(int *abort, const char *el)
{
if ( strcmp( el, "title") == 0 )
*abort = 1;
}
gcc -o testlibxml2 -g -O2 -Wall -pedantic testxmllib2.c -lxml2
libxml2 2.4.5
gcc -v
Reading specs from /usr/local/lib/gcc-lib/i686-pc-linux-gnu/2.95.3/specs
gcc version 2.95.3 20010315 (release)
Bill Moseley
mailto:moseley hank org