[xml] loading concatenated documents

Hi all,

I'd like to load a series of concatenated XML documents (it's a stream of sensor packets from a simulation).

Ideally, I would like to use xmlParseDocument() with a context from xmlCreateIOParserCtxt(), whose read and 
close callbacks pull the data out of a C++ istream, roughly as sketched below.
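
For reference, this is roughly how I'm wiring it up (istreamRead, istreamClose and parseOne are just 
illustrative names):

   #include <libxml/parser.h>
   #include <istream>

   /* read callback: pull up to len bytes from the istream;
      returns bytes read, 0 at EOF, -1 on error */
   static int istreamRead(void *context, char *buffer, int len)
   {
       std::istream *in = static_cast<std::istream *>(context);
       in->read(buffer, len);
       return in->bad() ? -1 : static_cast<int>(in->gcount());
   }

   static int istreamClose(void *)
   {
       return 0; /* the caller owns the stream */
   }

   /* parse a single document from the stream */
   static xmlDocPtr parseOne(std::istream &in)
   {
       xmlParserCtxtPtr ctxt = xmlCreateIOParserCtxt(
           NULL, NULL,                          /* default SAX handler */
           istreamRead, istreamClose, &in,
           XML_CHAR_ENCODING_NONE);
       if (ctxt == NULL)
           return NULL;
       xmlParseDocument(ctxt);
       xmlDocPtr doc = ctxt->wellFormed ? ctxt->myDoc : NULL;
       if (!ctxt->wellFormed && ctxt->myDoc != NULL)
           xmlFreeDoc(ctxt->myDoc);
       xmlFreeParserCtxt(ctxt);
       return doc;
   }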

libxml doesn't like this: it pulls a chunk (4kB) of data at a time, and unless I do something smart in the 
read callback to cut the input off at the end of one document, the parser sees the start of the next 
document and complains "Extra content at the end of the document".  Even setting that error aside, I then 
need to push the unused portion of the buffer back into the stream so the next document's parse can pick it 
up.

One solution is to wrap everything with a new root node and use SAX endElement callbacks to trigger the data 
processing, roughly as sketched below.  I find this kind of ugly: I'd prefer to be able to jump into the 
stream without needing a "header" just to make the parser happy, and I'd rather use the tree interface at 
the end of each packet than do much SAX.
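
For illustration, the wrapper version would look something like this with the push interface (onPacketEnd 
is a placeholder for the per-packet processing):

   #include <libxml/parser.h>
   #include <cstring>
   #include <istream>

   static void onPacketEnd(void *ctx, const xmlChar *name)
   {
       /* with depth tracking, trigger the data processing here when a
          packet element closes directly under the synthetic root */
   }

   static void parseWrapped(std::istream &in)
   {
       xmlSAXHandler sax;
       std::memset(&sax, 0, sizeof(sax));
       sax.endElement = onPacketEnd;

       /* synthetic root so the whole stream parses as one document */
       xmlParserCtxtPtr ctxt =
           xmlCreatePushParserCtxt(&sax, NULL, "<stream>", 8, NULL);
       char buf[4096];
       while (in.read(buf, sizeof(buf)) || in.gcount() > 0)
           xmlParseChunk(ctxt, buf, static_cast<int>(in.gcount()), 0);
       xmlParseChunk(ctxt, "</stream>", 9, 1); /* terminate */
       xmlFreeParserCtxt(ctxt);
   }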

The solution I'm considering is to have a SAX endElement which watches for the end of a root node, then puts 
the extra data back into the stream and marks the input complete, such as:
   for (const xmlChar *x = ctxt->input->end; x != ctxt->input->cur; )
       istream.putback((char)*--x);
   ctxt->input->end = ctxt->input->cur;
   done = true; // flag so the next read callback returns 0 (EOF) and parsing stops
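
In context, and assuming the SAX1-style callbacks are the active ones (with a SAX2 default handler you'd 
override startElementNs/endElementNs the same way), it would look something like this; ParseState is a 
made-up struct, and the read callback returns 0 once done is set:

   #include <libxml/parser.h>
   #include <libxml/SAX2.h>
   #include <istream>

   struct ParseState {
       std::istream *in;
       int depth;
       bool done; /* read callback returns 0 (EOF) once this is set */
   };

   static void myStartElement(void *ctx, const xmlChar *name,
                              const xmlChar **attrs)
   {
       xmlParserCtxtPtr ctxt = static_cast<xmlParserCtxtPtr>(ctx);
       xmlSAX2StartElement(ctx, name, attrs); /* keep building the tree */
       static_cast<ParseState *>(ctxt->_private)->depth++;
   }

   static void myEndElement(void *ctx, const xmlChar *name)
   {
       xmlParserCtxtPtr ctxt = static_cast<xmlParserCtxtPtr>(ctx);
       xmlSAX2EndElement(ctx, name); /* keep building the tree */
       ParseState *st = static_cast<ParseState *>(ctxt->_private);
       if (--st->depth > 0)
           return; /* not the root element yet */

       /* root closed: return the unconsumed bytes to the stream
          (clear() first, and note that multi-character putback relies
          on the streambuf cooperating) */
       st->in->clear();
       for (const xmlChar *x = ctxt->input->end; x != ctxt->input->cur; )
           st->in->putback(static_cast<char>(*--x));
       ctxt->input->end = ctxt->input->cur;
       st->done = true;
   }

   /* hookup, after xmlCreateIOParserCtxt(NULL, NULL, ...) has installed
      the default handler (so the callbacks' ctx is the context itself):
          ctxt->_private = &state;
          ctxt->sax->startElement = myStartElement;
          ctxt->sax->endElement   = myEndElement; */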

Is this a horrible hack?  Is there a better way, like a "load fragment" type of function that wouldn't 
complain about the extra content?

Thanks,
 -Ethan
