Re: [xml] loading concatenated documents



On Mon, Mar 29, 2010 at 11:24:59AM -0400, Ethan Tira-Thompson wrote:
> http://www.w3.org/TR/REC-xml/#NT-document
>
> [1]     document       ::=       prolog  element  Misc*
>
> Also, I should argue in terms of the spec: Misc is defined as a combination
> of comments, processing instructions and/or whitespace.  AFAIK, it does not
> say anything about the end of file.
>
> So in other words, libxml is assuming an additional restriction that it must
> receive an EOF coinciding with the end of the document.

  Wrong, you can pass the data to libxml2 any way you want, but you have
to indicate where the data ends or which chunk is the last one:
a 0-terminated string, an explicit byte count, a final indicator when using
the chunked interfaces, a special entry point, etc. It depends on the
I/O method used.
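
  As a rough illustration of the "final indicator" idea, here is a minimal
sketch using the expat bindings from Python's standard library rather than
libxml2 itself. The helper name parse_chunks is made up for this example.
The second argument to Parse() plays a role similar to the terminate
argument of libxml2's xmlParseChunk(): it tells the parser that no more
data will follow.

```python
import xml.parsers.expat


def parse_chunks(chunks):
    """Feed byte chunks to a push parser and return the element names seen.

    Hypothetical helper for illustration only; it uses expat, not libxml2.
    """
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)

    # Intermediate chunks: the parser is told that more data may follow.
    for chunk in chunks:
        parser.Parse(chunk, False)

    # Final indicator: an empty chunk flagged as the last one. Only now can
    # the parser decide whether the document as a whole is complete.
    parser.Parse(b"", True)
    return names


names = parse_chunks([b"<root><it", b"em/></ro", b"ot>"])
print(names)
```

  Without the final call, the parser cannot tell a truncated document from
one that is still arriving; with it, an unclosed <root> is reported as an
error.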

> However, EOF is just one possible delimiter.  As far as I'm concerned,
> it would be acceptable to mark the first non-Misc character as the end
> of the document and complete parsing without raising a hissy fit.

  No, any XML parser MUST report

"<foo/><foo/>"

as a document that is not well-formed if passed this data. It is not up to
the XML parser to decide what the document contains; the user has to
provide this. Failing to report the error would just make the parser
non-conformant to the XML 1.0 specification.
If you don't need an XML parser, fine, but libxml2 *is* an XML parser.
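
  A quick way to see this behaviour, sketched with Python's standard
xml.etree.ElementTree (which is built on expat, not libxml2); the function
name is_well_formed is just for this example:

```python
import xml.etree.ElementTree as ET


def is_well_formed(data):
    """Return True if data parses as exactly one well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<foo/>"))                         # True
print(is_well_formed("<foo/><foo/>"))                   # False
print(is_well_formed("<foo/><!-- trailing Misc -->\n"))  # True
```

  Trailing comments and whitespace are accepted because they match Misc*;
a second root element is not.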

  Stacking XML documents without keeping track of their boundaries *is* a
design error, unrelated to the parser being used.
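
  One possible way to keep track of boundaries, given here only as a sketch
and not as anything libxml2 provides: prefix each document with its byte
length when writing the stream, then hand each slice to the parser
separately. The 4-byte big-endian prefix and the names pack_documents and
unpack_documents are assumptions made for this example.

```python
import struct
import xml.etree.ElementTree as ET


def pack_documents(docs):
    """Concatenate documents, each preceded by a 4-byte big-endian length."""
    return b"".join(struct.pack(">I", len(doc)) + doc for doc in docs)


def unpack_documents(stream):
    """Yield each document's bytes, using the recorded lengths as boundaries."""
    offset = 0
    while offset < len(stream):
        (size,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        if offset + size > len(stream):
            raise ValueError("stream truncated inside a document")
        yield stream[offset:offset + size]
        offset += size


stream = pack_documents([b"<foo/>", b"<foo a='1'/>"])

# Each slice is a complete document, so the parser always knows where it ends.
roots = [ET.fromstring(doc) for doc in unpack_documents(stream)]
print([root.tag for root in roots])
```

  The same idea works with any framing the application controls (a
separator that cannot occur in the data, one file per document, and so
on); the point is that the boundary comes from the application, not from
the parser.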

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


