Re: [xml] xmlreader error detection
- From: Daniel Veillard <veillard redhat com>
- To: Elliotte Harold <elharo metalab unc edu>
- Cc: xml gnome org
- Date: Sun, 26 Nov 2006 11:43:05 -0500
On Sun, Nov 26, 2006 at 09:33:09AM -0500, Elliotte Harold wrote:
What happens when libxml, invoked via xmlreader (itself invoked via
PHP's XmlReader) detects a well-formedness error? How is the error
reported to the client application?
Either with the global default or with
http://xmlsoft.org/html/libxml-xmlreader.html#xmlTextReaderSetErrorHandler
In my experiments it seems that the read method merely returns false.
No, libxml2 always raises an error.
If
that's true, is there a way to distinguish between this case and the
simple end of the document?
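The reader's end of document should not be a -1 return, but 0:
xmlTextReaderRead returns 1 when a node was read, 0 at the end of the
document, and -1 on error.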
A related question: Theoretically, the parser could report data up to
the first error it finds. In my experiments with small documents,
however, it actually errors out immediately.
I think a lot of what you are seeing is specific to PHP, on which
unfortunately I can't comment.
I suspect the underlying
parser is preparsing a large chunk of the document, caching it, and then
doling it out a piece at a time. Thus it tends to detect errors
prematurely. Is this accurate?
That's how libxml2 operates underneath.
If so, is there a limit to how much it will preparse? I assume it's not
loading the whole document into a DOM first, and then iterating through
that.
No, not unless you ask for it. The amount buffered depends on a number of
factors, mostly the document, and potentially other things like RNG validation.
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/