Re: [xml] xmlSaveFormatFileEnc - xml file with binary zeros



Daniel Veillard wrote:
On Fri, Jul 31, 2009 at 10:01:24AM +0200, Martin Trappel wrote:
Hi all.

At a customer site, a configuration file that our app routinely writes ended up as 789 KB of binary zeros (0x00 / '\0').

The call we use to save the file is:
xmlSaveFormatFileEnc(
  WCharToUTF8(m_csXMLFile).c_str(), // generate UTF8 name from CString
  m_xmldoc,
  XML_ENCODING_UTF8,
  1
);

Perhaps someone has a creative idea about what could have happened to our program state to produce such a file through this call. (Of course, the whole thing isn't reproducible :/ )

Note:
I have seen that saving the data ends up in xmlio.c:xmlFileWrite, which calls fwrite(&buffer[0], len, 1, (FILE *) context). That call comes from xmlio.c:xmlOutputBufferFlush, where the write callback is invoked: ret = out->writecallback(out->context, (const char *)out->conv->content, out->conv->use);

So for libxml to actually generate such a file, we'd need a valid xmlBufferPtr (out->conv) with a sufficiently large buffer that is zeroed out.
Not much chance of that, is there?
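
To illustrate the point about the write path: a call of the shape fwrite(buffer, len, 1, fp) passes the buffer through untouched, so a zero-filled buffer of length len yields len zero bytes and still reports success. The sketch below is a standalone stand-in with a hypothetical name (writeWholeBuffer); it is not libxml2's xmlFileWrite, it only mirrors the call shape quoted above.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for a FILE*-based write callback (NOT libxml2 code).
// Returns the number of bytes handed to stdio, or -1 on error.
int writeWholeBuffer(std::FILE* fp, const char* buffer, int len) {
    if (fp == nullptr || buffer == nullptr || len < 0) {
        return -1;
    }
    if (len == 0) {
        return 0;  // nothing to write
    }
    // size = len, count = 1: fwrite returns 1 if the whole buffer was
    // accepted and 0 otherwise. The content of the buffer is never inspected.
    std::size_t items = std::fwrite(buffer, static_cast<std::size_t>(len), 1, fp);
    if (items != 1) {
        return -1;
    }
    return len;
}
```

A success return here only means stdio accepted the bytes; it says nothing about whether they were meaningful or whether they have reached the disk.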


Thanks in advance for any ideas and pointers, libxml-related or not!

In general my bet would be a bug in the conversion routines; it would be
useful to know what ended up being used. On some systems iconv is
implemented by dynamically loaded shared libraries, and things can become
really hard to trace from there. But in this case XML_ENCODING_UTF8 is the
internal tree encoding of libxml2, so encoding is a no-op, and it looks
like the serialization buffer got overwritten with zeros. Very weird;
I have no idea how this could happen. If the program is multithreaded, a
bug elsewhere might overwrite part of the buffer, but getting only zeros
and not crashing due to other data being tampered with sounds unlikely.
The last possibility is a bug in the I/O system...


Yes, we use only UTF-8 output at the moment, so I guess there should be no conversion taking place inside libxml. We are equally out of ideas here. I guess I have to hope it won't happen again, and if it really does, then at least it will be worth spending serious time on detecting the issue ...
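
One cheap way to detect a recurrence would be to scan the file after saving it and flag the case where every byte is 0x00. The following is only a minimal sketch: fileIsAllZeros is a hypothetical helper, not part of libxml2, and it assumes the path can be opened with plain fopen (on Windows, a UTF-8 file name containing non-ASCII characters would need a wide-character open instead).

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of libxml2): returns true only if the file
// can be read, is non-empty, and consists entirely of 0x00 bytes.
bool fileIsAllZeros(const std::string& path) {
    std::FILE* fp = std::fopen(path.c_str(), "rb");
    if (fp == nullptr) {
        return false;  // missing or unreadable file is not "all zeros"
    }
    unsigned char buf[4096];
    std::size_t total = 0;
    std::size_t n = 0;
    bool allZero = true;
    while (allZero && (n = std::fread(buf, 1, sizeof buf, fp)) > 0) {
        total += n;
        for (std::size_t i = 0; i < n; ++i) {
            if (buf[i] != 0) {
                allZero = false;  // found real content, stop scanning
                break;
            }
        }
    }
    bool readError = std::ferror(fp) != 0;
    std::fclose(fp);
    return !readError && allZero && total > 0;
}
```

Calling this right after the save and logging a hit (with thread id and timestamp, for example) would at least narrow down when the zeros appear. A check that runs immediately after writing may still read back from the OS cache, so it cannot rule out a problem further down in the I/O system.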

cheers,
Martin



