On Sun, Nov 26, 2006 at 09:33:09AM -0500, Elliotte Harold wrote:
What happens when libxml, invoked via xmlreader (itself invoked via 
PHP's XmlReader) detects a well-formedness error? How is the error 
reported to the client application?

  Either with the global default or with

In my experiments it seems that the read method merely returns false.

  No, libxml2 always raises an error.

If that's true, is there a way to distinguish between this case and the 
simple end of the document?

  The reader's end of document should not return -1, but 0.
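
  To make the distinction concrete: at the C level, xmlTextReaderRead()
returns 1 when a node was read, 0 when there are no more nodes, and -1 on
error. Below is a minimal sketch of a loop that keeps those three cases
apart. It does not call libxml2; drain() and scripted() are hypothetical
names, and the scripted reader is only a stand-in that replays a fixed
sequence of return codes. How PHP's XmlReader maps these codes onto its own
return value is not covered here.

```python
from typing import Callable, List, Tuple


def drain(read: Callable[[], int]) -> Tuple[int, str]:
    """Call read() until it stops returning 1 and report why the loop ended.

    read() is assumed to follow the xmlTextReaderRead() convention:
    1 = node read, 0 = end of document, -1 = error.
    """
    nodes = 0
    while True:
        ret = read()
        if ret == 1:
            nodes += 1  # a node is available; a real client would inspect it here
        elif ret == 0:
            return nodes, "end of document"
        else:
            return nodes, "parse error"


def scripted(returns: List[int]) -> Callable[[], int]:
    """Hypothetical stand-in for a reader: replays the given return codes."""
    it = iter(returns)
    return lambda: next(it)


clean = drain(scripted([1, 1, 1, 0]))   # well-formed document with three nodes
broken = drain(scripted([1, 1, -1]))    # error reported after two nodes

print(clean)
print(broken)
```

  The point of the sketch is only that 0 and -1 need separate branches; a
wrapper that folds both into a single "false" loses that information.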

A related question: Theoretically, the parser could report data up to 
the first error it finds. In my experiments with small documents, 
however, it actually errors out immediately.

  I think a lot of what you are seeing is specific to PHP, on which
unfortunately I can't comment.

I suspect the underlying 
parser is preparsing a large chunk of the document, caching it, and then 
doling it out a piece at a time. Thus it tends to detect errors 
prematurely. Is this accurate?

  That's how libxml2 operates underneath.

If so, is there a limit to how much it will preparse? I assume it's not 
loading the whole document into a DOM first, and then iterating through 

  No, unless you ask for it. The amount buffered depends on a number of factors,
mostly the document, and potentially other things like RNG validation.


