[xml] libxml violates the zlib interface and crashes
- From: Mark Adler <madler alumni caltech edu>
- To: xml gnome org
- Cc: Mark Brown <broonie sirena org uk>
- Subject: [xml] libxml violates the zlib interface and crashes
- Date: Mon, 18 Jan 2010 01:49:30 -0800
libxml authors and/or maintainers,
I am preparing a new release of zlib which includes a rewrite of the gz* functions (to wit, that read and
write gzip files). In the process of testing, Mark Brown discovered that libxml encounters a segmentation
fault with the current beta versions of zlib (1.2.3.5 and later). Within libxml version 2.7.6 in xmlIO.c we
find this offending code:
#ifdef HAVE_ZLIB_H
if ((xmlInputCallbackTable[i].opencallback == xmlGzfileOpen) &&
(strcmp(URI, "-") != 0)) {
if (((z_stream *)context)->avail_in > 4) {
char *cptr, buff4[4];
cptr = (char *) ((z_stream *)context)->next_in;
if (gzread(context, buff4, 4) == 4) {
if (strncmp(buff4, cptr, 4) == 0)
ret->compressed = 0;
else
ret->compressed = 1;
gzrewind(context);
}
}
}
#endif
This appears to be an attempt to determine if the input is a gzip file or not. It does so by finding and
comparing bytes from an internal zlib buffer to the data returned by zlib, using "(z_stream *)context" to
access the internal buffer, where "context" is zlib's otherwise opaque gzFile structure.
This code violates the zlib interface, penetrating the protected gzFile structure. (Gee, this is starting to
sound like a sexual assault.) libxml assumes things about zlib that are not documented in the interface
(zlib.h) and so cannot be relied upon and must not be used in code that professes to be dependable. This
includes the assumption that gzFile has the z_stream it uses in the structure as the first element, that it
uses the z_stream next_in pointer for the input buffer, and that gzopen() or gzdopen() reads data from the
input to determine whether it is a gzip file or not. None of these assumptions are true in the new gz* code
in the current beta of zlib, and so libxml crashes and burns.
It turns out that this offense is not necessary since the current production version of zlib, 1.2.3,
available since July of 2005 and widely deployed, has the gzdirect() function for exactly this purpose.
Since both libxml and zlib are widely deployed, and since I expect that applications usually link to it and
zlib dynamically, we have a problem.
The fix to libxml is to check the version of zlib, and to use the current code for ZLIB_VERNUM less than
0x1230, and to instead use gzdirect() for ZLIB_VERNUM greater than or equal to 0x1230. However, older
versions of libxml without this fix will fail with newer versions of zlib. E.g. if zlib is updated and
libxml is not.
An untested patch to illustrate the approach is provided below.
Mark Adler
--- xmlIO-2.7.6.c 2009-09-24 08:32:00.000000000 -0700
+++ xmlIO.c 2010-01-18 01:30:27.000000000 -0800
@@ -2518,6 +2518,9 @@
#ifdef HAVE_ZLIB_H
if ((xmlInputCallbackTable[i].opencallback == xmlGzfileOpen) &&
(strcmp(URI, "-") != 0)) {
+#if defined(ZLIB_VERNUM) && ZLIB_VERNUM >= 0x1230
+ ret->compressed = !gzdirect(context);
+#else
if (((z_stream *)context)->avail_in > 4) {
char *cptr, buff4[4];
cptr = (char *) ((z_stream *)context)->next_in;
@@ -2529,6 +2532,7 @@
gzrewind(context);
}
}
+#endif
}
#endif
}
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]