Re: [xml] Strange 0x0a char popping in generated XML



On Fri, Oct 03, 2014 at 02:38:21PM +0200, Jean-Philippe Jacoupy wrote:
The protocol I have to work with, which uses XML, refuses those.

They don't use the signing method defined in the XML because they specified
their own method for certification.

  Then it's not an XML-compliant protocol, and that's pretty bad design.
Would you name that protocol so that some public shame
can be cast on those who pushed for it?

The point is that when you call xmlTextWriterSetIndent(writer, 0)
before EVEN starting your document, I expect to get no indentation, as
stated here:
<http://xmlsoft.org/html/libxml-xmlwriter.html#xmlTextWriterSetIndent>

The first '\n' is indentation.

I modified the lib, but I'm still reporting something that seems wrong to me.
You'll find the patch attached and below.

  Sorry, I won't change that behaviour. This would break people expecting
those line feeds, for example in regression tests.
  Indent is about adding whitespace in significant content, and it's
actually not the default. What you are doing is changing a part where
whitespace is not supposed to be significant.

  See for example the differences on xmllint with --pretty
for the value of 0 (no change), 1 (change adding significant content)
and 2 (change adding non significant content)

--- libxml2-2.7.2/xmlwriter.c    Tue Mar 11 23:54:05 2008
+++ libxml2-2.7.2/xmlwriter.c    Fri Aug 29 16:38:31 2014
@@ -614,7 +614,12 @@
         sum += count;
     }

-    count = xmlOutputBufferWriteString(writer->out, "?>\n");
+    if (writer->indent) {
+        count = xmlOutputBufferWriteString(writer->out, "?>\n");
+    }
+    else {
+        count = xmlOutputBufferWriteString(writer->out, "?>");
+    }
     if (count < 0)
         return -1;
     sum += count;

As for the other one I remove it inside my code.
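
For readers following along, removing the two line feeds after serialization could look roughly like the sketch below. It is plain C with no libxml2 dependency. strip_writer_newlines is a hypothetical helper name, and the sketch assumes the buffer is NUL-terminated and, when a declaration is present, starts with it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2): removes, in place, the single
 * '\n' that follows the XML declaration and the single '\n' at the very end
 * of a NUL-terminated serialized document. Returns the new length.
 */
size_t strip_writer_newlines(char *xml)
{
    size_t len = strlen(xml);

    /* Trailing line feed written after the document element. */
    if (len > 0 && xml[len - 1] == '\n') {
        xml[--len] = '\0';
    }

    /* Line feed right after the "?>" that closes the XML declaration. */
    if (strncmp(xml, "<?xml", 5) == 0) {
        char *end = strstr(xml, "?>");
        if (end != NULL && end[2] == '\n') {
            /* Shift the remainder (including the NUL) left by one byte. */
            memmove(end + 2, end + 3, len - (size_t) (end + 3 - xml) + 1);
            len--;
        }
    }
    return len;
}
```

Line feeds inside element content are left alone; only the two positions discussed in this thread are touched.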

  The writer might be able to save without the XMLDecl, which you could
then add yourself without that line feed.

Daniel

On Fri, Oct 3, 2014 at 1:11 PM, Daniel Veillard <veillard redhat com> wrote:

On Fri, Oct 03, 2014 at 11:15:30AM +0200, Jean-Philippe Jacoupy wrote:
On Fri, Oct 3, 2014 at 10:56 AM, Daniel Veillard <veillard redhat com>
wrote:

On Tue, Aug 26, 2014 at 12:30:01PM +0000, Jean-Philippe Jacoupy wrote:
Hello,

I'm using libxml2 and I have a strange behaviour.

I'm creating a full document in memory (using xmlTextWriter with a
xmlBuffer).

I have called xmlTextWriterSetIndent with 0 as parameter.

Whenever I get the buffer content (once I have called
xmlTextWriterEndDocument) I get strange 0x0a inserted:
 - 1 after the xml header
 - 1 after the end of the xml document


  It's not strange, that's a newline character, which is present
as non-significant white space and will be ignored by XML parsers
and hence by the whole tool chain consuming the output.

Daniel


Thanks for your response Daniel,

But I still think the presence of those \n is bogus.
Even if the XML parser ignores the \n when reading (which is OK),
when you have to cypher the document, the extra '\n' characters change the result.

Both of them prevent generating a linearized XML document, and that's the
point of this report.

I mean this isn't a linearized xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<Document><data1 /><data2><data21 /></data2></Document>\n

Whereas this is a linearized xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Document><data1 /><data2><data21 /></data2></Document>

  If you want to sign the output, there is a canonical format and you
should use that. Libxml2 supports it!

As for "linearized", that's severely bogus. You mean you have
recipients who won't parse anything with a line feed in it?
Where did that definition come from? What happens if one of your data
fields has content with a \n inside? *That* is the broken part.

That's not a libxml2 issue. XML is here as a spec to define
interoperability: there are a number of places where that interop is
guaranteed even if you reformat the document, and there is equivalence
at the XML Infoset level. The two added \n are in those places.

Daniel



I'm under Windows compiling with VS2008 against LibXML2 version 2.7.2

PS:
- While searching the libxml2 code, at the end of the
xmlTextWriterStartDocument function I found this:

count = xmlOutputBufferWriteString(writer->out, "?>\n"); (L. 617)

Shouldn't the '\n' be guarded by an if (writer->indent) ?

 - I found the other one in xmlTextWriterEndDocument:

if (!writer->indent) { (L. 701)

instead of

if (writer->indent) {

as is done everywhere else in the file.

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
https://mail.gnome.org/mailman/listinfo/xml

--
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit
http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/




--
Cordialement,
JACOUPY Jean-Philippe
