Re: [xml] How to reset an HTML push parser context?
- From: Stefan Behnel <stefan_ml behnel de>
- To: veillard redhat com
- Cc: xml gnome org
- Subject: Re: [xml] How to reset an HTML push parser context?
- Date: Tue, 11 Sep 2007 13:26:30 +0200
Daniel Veillard wrote:
> On Mon, Sep 10, 2007 at 09:45:10AM +0200, Stefan Behnel wrote:
>> Hi,
>> there isn't currently an API function for resetting a push parser context for
>> the HTML parser. However, resetting it for reuse doesn't seem to be trivial.
>> It looks like I have to run htmlCtxtReset() and then create and set up an
>> input stream (in a pretty ugly way, according to the Create code...). This
>> could well motivate an official function.
>> I also thought about using the xmlCtxtResetPush function, but then I stumble
>> over things like the spaceTab setup (which is currently a sure crasher for me).
>> Is there anything else I have to do to implement this functionality by hand?
>> And: is there an easier way?
>
> Honestly I don't know. I don't see why xmlCtxtResetPush() would not
> work for an html parser context.
In case others are interested, the code below works for me (Pyrex code, but
should be readable).
Stefan
cdef int _htmlCtxtResetPush(xmlparser.xmlParserCtxt* c_ctxt,
                            char* c_data, int buffer_len,
                            char* c_encoding, int parse_options) except -1:
    # libxml2 crashes if spaceTab is not initialised
    if _LIBXML_VERSION_INT < 20629 and c_ctxt.spaceTab is NULL:
        c_ctxt.spaceTab = <int*>tree.xmlMalloc(10 * sizeof(int))
        if c_ctxt.spaceTab is NULL:
            python.PyErr_NoMemory()
        c_ctxt.spaceMax = 10

    # libxml2 lacks an HTML push parser setup function
    error = xmlparser.xmlCtxtResetPush(c_ctxt, NULL, 0, NULL, c_encoding)
    if error:
        return error

    # fix libxml2 setup for HTML
    c_ctxt.progressive = 1
    c_ctxt.html = 1
    htmlparser.htmlCtxtUseOptions(c_ctxt, parse_options)

    # feed the initial chunk, if any, without terminating the document
    if c_data is not NULL and buffer_len > 0:
        return htmlparser.htmlParseChunk(c_ctxt, c_data, buffer_len, 0)
    # nothing to parse yet, report success explicitly
    return 0
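
For readers who don't have a libxml2 build at hand, the same reset-then-feed
pattern can be sketched with the HTML push parser from Python's standard
library. This is only an analogy, not libxml2 and not the code above:
html.parser.HTMLParser stands in for the parser context, its reset() for
xmlCtxtResetPush(), and feed() for htmlParseChunk(). The names TagCollector
and parse_in_chunks are hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records the start tags seen by a push-style parser."""

    def reset(self):
        # HTMLParser.__init__() calls reset(), so this also creates self.tags
        # on first construction. On later calls it drops any buffered input
        # and clears the per-document state.
        super().reset()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_in_chunks(parser, chunks):
    """Reset the parser, push the chunks one by one, then finish the document."""
    parser.reset()
    for chunk in chunks:
        parser.feed(chunk)   # chunks may split a tag in the middle
    parser.close()           # flush whatever is still buffered
    return list(parser.tags)


# One parser object, reused for two unrelated documents.
parser = TagCollector()
first = parse_in_chunks(parser, ["<html><bo", "dy><p>one</p></body></html>"])
second = parse_in_chunks(parser, ["<div><span>two</sp", "an></div>"])
```

The point of the sketch is that all per-document state lives behind reset(),
so the second document sees nothing left over from the first. That is the
property the hand-written _htmlCtxtResetPush() has to restore manually for
the HTML case.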