Re: [xml] Strange 0x0a char popping in generated XML



On Fri, Oct 03, 2014 at 11:15:30AM +0200, Jean-Philippe Jacoupy wrote:
On Fri, Oct 3, 2014 at 10:56 AM, Daniel Veillard <veillard redhat com>
wrote:

On Tue, Aug 26, 2014 at 12:30:01PM +0000, Jean-Philippe Jacoupy wrote:
Hello,

I'm using libxml2 and I'm seeing a strange behaviour.

I'm creating a full document in memory (using xmlTextWriter with a
xmlBuffer).

I have called xmlTextWriterSetIndent with 0 as parameter.

Whenever I get the buffer content (once I have called
xmlTextWriterEndDocument) I get strange 0x0a bytes inserted:
 - 1 after the xml header
 - 1 after the end of the xml document


  It's not strange: that's a newline character, which is present
as non-significant white space and will be ignored by XML parsers,
and hence by the whole tool chain consuming the output.

Daniel


Thanks for your response Daniel,

But I still think the presence of those \n is bogus.
Even if the XML parser ignores the \n when reading (which is OK),
when you have to encrypt the document, the extra '\n' characters change
the result.

Both of them prevent generating a linearized XML document. And that's the
point of this report.

I mean this isn't a linearized xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<Document><data1 /><data2><data21 /></data2></Document>\n

Whereas this is a linearized xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Document><data1 /><data2><data21 /></data2></Document>
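
If the two line feeds really have to be gone before the bytes are processed further, one possible workaround is to post-process the buffer. The sketch below is plain C and is not something libxml2 provides; the helper name strip_writer_newlines is hypothetical. It assumes the only unwanted bytes are the two described in this report: one '\n' directly after the XML declaration and one at the very end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes, in place, one '\n' directly after the
 * XML declaration and one '\n' at the very end of buf. buf must be a
 * NUL-terminated string of length len. Returns the new length.
 * Line feeds inside the root element are left untouched. */
size_t strip_writer_newlines(char *buf, size_t len)
{
    assert(buf != NULL);

    /* Trailing line feed written after the root element. */
    if (len > 0 && buf[len - 1] == '\n')
        len--;

    /* Line feed right after the "?>" that closes the declaration. */
    if (len >= 5 && memcmp(buf, "<?xml", 5) == 0) {
        for (size_t i = 5; i + 1 < len; i++) {
            if (buf[i] == '?' && buf[i + 1] == '>') {
                if (i + 2 < len && buf[i + 2] == '\n') {
                    memmove(buf + i + 2, buf + i + 3, len - (i + 3));
                    len--;
                }
                break;  /* only the first "?>" belongs to the declaration */
            }
        }
    }

    buf[len] = '\0';
    return len;
}
```

This only makes the bytes match the "linearized" sample above. It does not produce a canonical document, so the canonical form discussed in the reply that follows remains the more robust route for signing.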

  If you want to sign the output, there is a canonical format and you
should use that. Libxml2 supports it!

Considering "linearized", that's severely bogus: you mean you have
recipients who won't parse anything with a line feed in it?
Where did that definition come from? What happens if one of your data
fields has content with a \n inside? *That* is the broken part.

That's not a libxml2 issue. XML is here as a spec to define
interoperability; there are a number of places where that interop is
guaranteed even if you reformat the document, and there is equivalence
at the XML Infoset level. The two added \n fall in those places.

Daniel



I'm on Windows, compiling with VS2008 against libxml2 version 2.7.2.

PS:
- While searching the libxml2 source, I found this at the end of the
xmlTextWriterStartDocument function:

count = xmlOutputBufferWriteString(writer->out, "?>\n"); (L. 617)

Shouldn't the '\n' be guarded by an if (writer->indent)?

 - I found the other one in xmlTextWriterEndDocument:

if (!writer->indent) { (L. 701)

instead of

if (writer->indent) {

as is done in the rest of the file.

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
https://mail.gnome.org/mailman/listinfo/xml

--
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/




-- 
Cordialement,
JACOUPY Jean-Philippe

-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

