
Re: [xml] sax2 interface and parsing on the fly



On Mon, Apr 12, 2004 at 07:37:27PM +0200, aliban gmx net wrote:
> hi,
> for my spare-time project of the last 0.75 years I have been building an XMPP
> lib for use on different platforms. It is not finished yet but
> nearly done.
[...]
> Problem:
> For the stream analysis I wrote a kind of stream reader that is
> in fact a SAX interface that I feed at irregular intervals with new buffers
> containing the newly received elements (<message/> <presence/>).
> Yesterday I downloaded a new build of libxml2 (the Windows binaries from
> zlatkovic).
> I noticed that my lib did not behave as expected:
> the startElement() function is not called anymore.
> However, when I disconnect from the XMPP server (the server then sends a
> </stream:stream>), the function gets called and all the data is
> parsed at that point...

  You are generalizing from a small test case. Push parsing is
still done on the fly; the change is that the parser may now buffer data.

 http://bugzilla.gnome.org/show_bug.cgi?id=136466

> Therefore my current code no longer seems to trigger parsing. I
> don't know why, but there might be two explanations:
> 
> 1. you changed the libxml2 SAX2 parser into something like xmlreader
> (and it cannot parse XML on the fly anymore)
> or
> 2. you changed the behaviour of some function that in earlier versions
> caused the parser to parse, probably xmlParseChunk()
> 
> It would be great if you could help me, and even better
> if you don't tell me that libxml2 (SAX2) will no longer be able to parse on
> the fly in the future.

This is a consequence of a real bug fix:

 http://bugzilla.gnome.org/show_bug.cgi?id=134566

  When data is pushed, the push parser has to find out whether a start element
is complete. It was doing so by looking for < and > from the end of the buffer
provided. Guaranteeing synchronous evaluation like you ask for would force the
push front-end to scan the buffer *fully* to be sure there is no evaluation
error at the start/end tag parsing boundaries. This is expensive, and
most applications do not need it.
  This is a trade-off. Another API could be added for your need, forcing
a flush of the currently available data.
  It is not a matter of parsing on the fly, it is a matter of expectations
put on top of the APIs that I never guaranteed. Adding the extra
flush API is possible, but it is not my top priority. It's all about
xmlParseGetLasts() and xmlParseTryOrFinish() in parser.c.
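
  To make the cheap end-of-buffer check concrete, here is a minimal sketch
in C. It is not the code from parser.c: complete_prefix_len() is a
hypothetical helper, written only to illustrate the idea of looking for
< and > from the end of a pushed chunk, and it assumes the chunk starts
outside of any tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT libxml2 code: scan backward from the end of a
 * pushed chunk and report how many leading bytes end on a tag boundary.
 * Bytes past the returned length would stay buffered until more data
 * arrives. */
size_t complete_prefix_len(const char *buf, size_t len)
{
    size_t i = len;

    assert(buf != NULL || len == 0);

    while (i > 0) {
        char c = buf[i - 1];
        if (c == '>')
            return len;     /* last markup char closes a tag: pass everything */
        if (c == '<')
            return i - 1;   /* a tag opens here and is not closed: hold it back */
        i--;
    }
    /* No '<' or '>' seen: this sketch treats the chunk as plain text and
     * ignores any tag left open by an earlier chunk. */
    return len;
}
```

  For the 15-byte chunk "<message/><pres" this returns 10, so only
"<message/>" would be handed on and "<pres" would wait for the rest of the
tag. The scan only touches the tail of the buffer, which is why it is cheap,
but it can be fooled: a literal > inside an attribute value makes an
unfinished tag look complete. Ruling that out is what would require scanning
the whole buffer.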

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


