Re: [xml] xmlreader and chunked parsing?



On Sat, Nov 01, 2003 at 08:25:59AM +0000, Nick Kew wrote:

> I regularly use the SAX API with chunked parsing ( [x|ht]mlParseChunk
> and family), as this is ideal for a pipelined processing environment
> where data are naturally available in chunks.
>
> I've had a brief look at xmlreader with a view to considering it as
> an alternative, but I haven't found anything similar.  I could in
> principle use it with something like
>
> while ( ! end ) {
>   status = [ process something in xmlreader ]
>   switch ( status ) {
>     case OK: [ process and continue ]
>     case Out of Data:
>       [ if the parser internally remains in a consistent
>         state then we can feed it another chunk and continue ]
>     other: [ handle error ]
>   }
> }
>
> The crucial question is: can I catch out-of-data whilst preserving
> internal parser state, and without significant overhead?  Is this
> realistic, or would I be wasting my time trying?

  Can't you instead write your own I/O routines (read, close) and
use the standard I/O wrapper mechanism?

  xmlTextReaderPtr
  xmlReaderForIO(xmlInputReadCallback ioread, xmlInputCloseCallback ioclose,
                 void *ioctx, const char *URL, const char *encoding,
                 int options)

The point of the xmlReader is precisely to simplify the consuming loop.
If you start adding I/O condition handling back into that loop, IMHO
you lose most of the benefits of the xmlReader API.

  Internally a reader is based on a parser in chunked parsing mode, but
breaking the API to expose those conditions to the event loop doesn't sound
wise to me. How is the I/O approach not right w.r.t. your problem space?
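
  To make the suggestion concrete, here is a rough sketch of what such
read/close routines might look like. The chunk_source struct, chunk_read,
chunk_close, count_nodes and the USE_LIBXML2 build flag are all made up
for illustration; only xmlReaderForIO, xmlTextReaderRead and
xmlFreeTextReader are libxml2 calls. It assumes the chunks are already
sitting in memory. In a real pipeline the read routine would have to wait
for the next chunk rather than return 0, since 0 tells the parser that
the input has ended.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk source: an array of buffers that have already arrived. */
typedef struct {
    const char **chunks;  /* NUL-terminated chunk buffers */
    size_t count;         /* number of chunks */
    size_t index;         /* chunk currently being drained */
    size_t offset;        /* bytes already consumed from that chunk */
    int closed;           /* set once the close routine has run */
} chunk_source;

/* Same shape as xmlInputReadCallback: copy up to len bytes into buffer,
 * return the number of bytes copied, 0 at end of input, -1 on error. */
static int chunk_read(void *context, char *buffer, int len)
{
    chunk_source *src = context;

    if (src == NULL || buffer == NULL || len < 0)
        return -1;
    assert(src->index <= src->count);

    while (len > 0 && src->index < src->count) {
        const char *cur = src->chunks[src->index];
        size_t avail = strlen(cur) - src->offset;

        if (avail == 0) {           /* chunk exhausted, move to the next one */
            src->index++;
            src->offset = 0;
            continue;
        }
        size_t n = avail < (size_t)len ? avail : (size_t)len;
        memcpy(buffer, cur + src->offset, n);
        src->offset += n;
        return (int)n;
    }
    return 0;                       /* no more chunks: end of input */
}

/* Same shape as xmlInputCloseCallback: 0 on success, -1 on error. */
static int chunk_close(void *context)
{
    chunk_source *src = context;

    if (src == NULL)
        return -1;
    src->closed = 1;
    return 0;
}

#ifdef USE_LIBXML2  /* hypothetical flag: define it when linking with libxml2 */
#include <libxml/xmlreader.h>

/* The consuming loop stays free of I/O handling: it only asks for nodes. */
static int count_nodes(chunk_source *src)
{
    xmlTextReaderPtr reader =
        xmlReaderForIO(chunk_read, chunk_close, src, NULL, NULL, 0);
    int nodes = 0;
    int ret;

    if (reader == NULL)
        return -1;
    while ((ret = xmlTextReaderRead(reader)) == 1)
        nodes++;
    xmlFreeTextReader(reader);
    return ret == 0 ? nodes : -1;   /* 0 means clean end, -1 a parse error */
}
#endif
```

  The reader pulls data through chunk_read whenever it needs more, so the
out-of-data condition never reaches the loop in count_nodes.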

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


