Re: [xml] Parsing big xml data received by chunks from libcurl



Hi Daniel,

Thank you very much. It works.


On Wed, Dec 2, 2015 at 09:10, Daniel Veillard <veillard redhat com> wrote:
On Tue, Dec 01, 2015 at 05:11:52PM +0000, David Boucher wrote:
> Hi list,
>
> I use libcurl to fetch a large XML document.
>
> In the CURLOPT_WRITEFUNCTION callback, I receive a block of memory holding
> part of the XML data.
>
> The first time this callback is executed, we call xmlReaderNewMemory().
> Then we call xmlTextReaderRead() as long as it returns 1.
>
> Because the XML is split across callbacks, the loop ends with a failure:
> the reader needs the data that follows...
>
> Thanks to xmlTextReaderByteConsumed, we can tell how much data has already
> been read, and therefore which part has not been read yet.
>
> The next time the callback is called, we can build a new buffer
> containing:
> * data not yet read from the previous call
> * new data from the current call.
>
> My problem is here. I'm looking for a function that switches the reader to
> a new buffer so it can continue parsing the XML data. I have tried
> xmlReaderNewMemory(), but it fails...
>
> Maybe no such function exists, and maybe reading a single XML document
> from several buffers is a bad idea.
>
> Is there a better way? What would you advise?

  Err, you want to use the push parser when you don't have all the data
available at parser creation time.

  Get the xmllint.c program and look at the code in parseAndPrintFile
which handles the push testing case; it does something like

                /* The first 4 bytes let the parser detect the encoding. */
                res = fread(chars, 1, 4, f);
                if (res > 0) {
                    ctxt = xmlCreatePushParserCtxt(NULL, NULL,
                                chars, res, filename);
                    xmlCtxtUseOptions(ctxt, options);
                    /* Feed every further chunk with terminate = 0. */
                    while ((res = fread(chars, 1, size, f)) > 0) {
                        xmlParseChunk(ctxt, chars, res, 0);
                    }
                    /* Signal the end of the input with terminate = 1. */
                    xmlParseChunk(ctxt, chars, 0, 1);
                    doc = ctxt->myDoc;
                    ret = ctxt->wellFormed;
                    xmlFreeParserCtxt(ctxt);
                    if (!ret) {
                        xmlFreeDoc(doc);
                        doc = NULL;
                    }
                }

  You create the parser context with the first 4 bytes of your stream,
define which options you want to use, and then call xmlParseChunk( ... 0)
for each part until you reach the end, where you call xmlParseChunk( ... 1).

Daniel

> Thanks a lot.
> Regards.
> David.



--
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

