[xml] Important: possible incompatible changes ahead for 2.9.0 !




    Hello everybody,

   As some of you following the libxml2 git commits may have found
out, I pushed a number of patches to clean up the libxml2 code on Friday.
Most of them deal with large input data, and some of those
changes added specific limits to parsing, like a maximum length
for an XML Name (or NmToken), a maximum lookahead size for the parser
in push mode, etc. All the limits affecting the parser can be deactivated
by using the XML_PARSE_HUGE parser option, as for the few other existing
parser limits.
  At the API level, I also had to make an incompatible change (but
with ABI compatibility!) for parser buffers. The problem is
that those buffers were using int instead of size_t for various sizes,
leading to a variety of troubles, including security ones. How to fix
that while keeping everything public API and ABI compatible? Not doable
IMHO. So I changed one of the inner buffer structures of the parser
input and output to make it private, and fixed the issue there, but
some applications may still access those fields directly. One
was already reported inside GNOME, so I expect others to show up.
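
  To make the int versus size_t problem concrete, here is a minimal
sketch in plain C. It is not libxml2 code: the helper names
size_fits_int and grow_size_checked are made up for this illustration.
It shows that a length above INT_MAX cannot be stored in an int size
field, and that growing a size needs an overflow check even with size_t.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when len can be stored in an int
 * size field without loss, 0 otherwise. */
int size_fits_int(size_t len)
{
    return len <= (size_t) INT_MAX;
}

/* Hypothetical helper: computes use + extra into *out.
 * Returns 1 on success, 0 if the sum would overflow size_t
 * (in which case *out is left untouched). */
int grow_size_checked(size_t use, size_t extra, size_t *out)
{
    if (extra > SIZE_MAX - use)
        return 0;
    *out = use + extra;
    return 1;
}
```

With an int field, any input larger than INT_MAX bytes has to be either
rejected or silently truncated; a size_t field removes that limit, but
the arithmetic on it still has to be checked.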

  The new buffer structure is ABI compatible with the old one,
i.e. old code as already compiled will be able to work with the new one, as
the fields with the same values are in the same place in the new
structure. But the structure is now opaque, and the few places where
code was accessing it directly will need fixing.
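
  The general idea of keeping old compiled code working while hiding the
structure can be sketched roughly as follows. This is an illustration
only, not the actual libxml2 definitions: the struct names old_buf and
new_buf, the compat_ fields, and all helper functions are assumptions
made for the example, as is the choice to clamp the legacy fields at
INT_MAX.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-in for an old public buffer layout (hypothetical names). */
struct old_buf {
    unsigned char *content;
    unsigned int use;   /* bytes used */
    unsigned int size;  /* bytes allocated */
};

/* Stand-in for a new private layout: the legacy fields stay first, in
 * the same order, and the real sizes live in size_t fields after them. */
struct new_buf {
    unsigned char *content;
    unsigned int compat_use;
    unsigned int compat_size;
    size_t use;
    size_t size;
};

/* Clamp a size_t value into the range of the legacy fields. */
unsigned int clamp_legacy(size_t v)
{
    return v < (size_t) INT_MAX ? (unsigned int) v : (unsigned int) INT_MAX;
}

/* Keep the legacy mirror fields in sync after changing use or size. */
void new_buf_sync(struct new_buf *b)
{
    b->compat_use = clamp_legacy(b->use);
    b->compat_size = clamp_legacy(b->size);
}

/* Accessor that new code calls instead of reading fields directly. */
size_t new_buf_use(const struct new_buf *b)
{
    return b->use;
}

/* Returns 1 when the legacy fields sit at the same offsets in both
 * layouts, which is what already compiled code relies on. */
int legacy_layout_matches(void)
{
    return offsetof(struct old_buf, content) == offsetof(struct new_buf, content)
        && offsetof(struct old_buf, use) == offsetof(struct new_buf, compat_use)
        && offsetof(struct old_buf, size) == offsetof(struct new_buf, compat_size);
}
```

Old binaries keep reading the fields at the offsets they were compiled
against, while newly compiled code only sees the accessor and never the
layout.
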
  From the usage I have seen, one example is direct access to xmlOutputBuffer data:

  buf = xmlAllocOutputBuffer (NULL);
  ....dump stuff to the buffer...
  use data at buf->buffer->content, of size buf->buffer->use

First, okay, that was allowed by the API, but such buffers were supposed
to be used for I/O and encoding conversion; in general, accessing
buf->buffer->content and buf->buffer->use directly was not really the
expected way to do things. The fact that xmlOutputBuffers were not
supposed to be used that way is the reason why there were no accessors for
getting the output data. This is now fixed as of commit

  http://git.gnome.org/browse/libxml2/commit/?id=e258adecd0e19a6cfe6afa232b89aa416368820e

 So where there is such direct access, check for the LIBXML2_NEW_BUFFER
macro and, if it is defined, then
   - replace buf->buffer->content with xmlOutputBufferGetContent(buf)
   - replace buf->buffer->use with xmlOutputBufferGetSize(buf)

leading to something along those lines:
--- calendar/backends/caldav/e-cal-backend-caldav.c.orig        2012-08-06 12:39:16.368456121 +0800
+++ calendar/backends/caldav/e-cal-backend-caldav.c     2012-08-06 12:41:20.602442480 +0800
@@ -1792,11 +1792,19 @@ caldav_receive_schedule_outbox_url (ECal
        soup_message_headers_append (message->request_headers, "User-Agent", "Evolution/" VERSION);
        soup_message_headers_append (message->request_headers, "Depth", "0");
 
+#ifdef  LIBXML2_NEW_BUFFER
+       soup_message_set_request (message,
+                                 "application/xml",
+                                 SOUP_MEMORY_COPY,
+                                 (gchar *) xmlOutputBufferGetContent(buf),
+                                 xmlOutputBufferGetSize(buf));
+#else
        soup_message_set_request (message,
                                  "application/xml",
                                  SOUP_MEMORY_COPY,
                                  (gchar *) buf->buffer->content,
                                  buf->buffer->use);
+#endif
 
        /* Send the request now */
        send_and_handle_redirection (cbdav->priv->session, message, NULL);

  If in some place the xmlBufferPtr was passed independently of the
OutputBuffer, it's possible to use xmlBufGetContent(buffer) and
xmlBufUse(buffer) to achieve the same. This should also work for
xmlParserInputBuffer access to buf->buffer. When tweaking the
XML parser input structures, that access was usually done with
something like
   ctxt->input->buf->buffer
Now buffer is an opaque structure and can't be dereferenced, but
   xmlBufGetContent(ctxt->input->buf->buffer)
   and
   xmlBufUse(ctxt->input->buf->buffer)
should be used if LIBXML2_NEW_BUFFER is defined.

  If more complex uses of those buffer structures require more complex
treatment, please advise on the list; I can provide more methods for the
new buffer structure xmlBuf if needed.

   I have put up a snapshot tarball libxml2-2.9.0-rc0.tar.gz (and rpms)
for people to try out and raise issues with this change. I would
not expect much code to be affected, but we have at least one example
already.

 If there are other places where the xmlOutputBuffer and
xmlParserInputBuffer changes in libxml2 git head give problems I'm
ready to help out.

  I don't plan to make an official release with the changes before
September, so there is a bit of time to get this all cleaned up, and
possibly refine the migration strategy for the few apps affected.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
