Re: [xml] SAX HTML still stuck.



On Thu, Oct 04, 2001 at 12:43:37PM -0700, Bill Moseley wrote:
At 02:30 PM 10/04/01 -0400, Daniel Veillard wrote:
[...] 
    ctxt = htmlCreatePushParserCtxt(
        SAXHandler, &abort, buf, res, argv[1], 0);

    while ( !abort && (res = fread(buf, 1, 2048, f)) > 0)
        htmlParseChunk(ctxt, buf, res, 0);

    htmlParseChunk(ctxt, buf, 0, 1);

 why are you calling htmlParseChunk(ctxt, buf, 0, 1); if you have
aborted ??? 

Sorry I'm so damn dumb.

  Sorry, I'm a bit under pressure too ...

All I'm doing is setting a flag that says there's no more input.  Just as
if the file was truncated.  How do I know *not* to call
htmlParseChunk(ctxt, buf, 0, 1)?  I'm at the end of the file, there's no
more input, that's all I know.

  if (!abort)
      htmlParseChunk(ctxt, buf, 0, 1);

I tried to follow the example in testHTML.c, (e.g. where it calls
endElementDebug() ), and James Henstridge's examples, and
http://www.xmlsoft.org/#interface for the push parser, and the docs.  I'm
not trying to build a tree, so I'm not following why you are referring to
DOM.  I'm trying to use the push interface with SAX:

  I was wrong. I read your code and misunderstood it:
I read it as if your SAX handler were identical to the default one with
only endElement changed, and so made the wrong analysis.

I don't mind boring and non-sexy answers -- I just need help understanding
what exactly is wrong with my example code, as I think I'm following the
examples and docs as best I can understand them.  If I'm not, then what I
need is a clear explanation of why, and perhaps a pointer to more info.

  It's clear that there is yet another subtle bug in the push code.
First check whether
   if (!abort)
       htmlParseChunk(ctxt, buf, 0, 1);
works around your problem.
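
For reference, the control flow of that workaround can be sketched without
libxml at all. Everything below is an illustrative assumption:
parse_chunk_stub() is a hypothetical stand-in for htmlParseChunk() that only
counts how it was called, and struct feed_state, stop_at and feed_all() are
made-up names. The stop flag plays the role of the abort flag that a SAX
callback would set.

```c
#include <assert.h>
#include <stddef.h>

/* Bookkeeping for the sketch; not a libxml type. */
struct feed_state {
    int stop;          /* set by a "callback" to request an early exit */
    size_t stop_at;    /* simulate a callback firing after this many bytes; 0 = never */
    size_t bytes_seen; /* bytes handed to the stub so far */
    int data_calls;    /* calls made with terminate == 0 */
    int final_calls;   /* calls made with terminate == 1 */
};

/* Hypothetical stand-in for htmlParseChunk(ctxt, buf, len, terminate). */
static void parse_chunk_stub(struct feed_state *st, const char *buf,
                             size_t len, int terminate)
{
    (void)buf; /* the stub never inspects the data */
    if (terminate) {
        st->final_calls++;
        return;
    }
    st->data_calls++;
    st->bytes_seen += len;
    /* Simulate a SAX callback deciding that it has seen enough. */
    if (st->stop_at != 0 && st->bytes_seen >= st->stop_at)
        st->stop = 1;
}

/* Feed `input` in pieces of at most `chunk` bytes, then terminate the
 * parse only if nobody asked to stop in the meantime. */
static void feed_all(struct feed_state *st, const char *input,
                     size_t total, size_t chunk)
{
    size_t off = 0;

    assert(chunk > 0); /* a zero chunk size would loop forever */
    while (!st->stop && off < total) {
        size_t n = (total - off < chunk) ? total - off : chunk;
        parse_chunk_stub(st, input + off, n, 0);
        off += n;
    }
    if (!st->stop) /* the guard discussed above */
        parse_chunk_stub(st, NULL, 0, 1);
}
```

With a zero-initialized struct feed_state, feed_all() ends with exactly one
terminating call. With stop_at set, the loop exits early and the terminating
call is skipped.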

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



