Re: [xml] Important: possible incompatible changes ahead for 2.9.0 !



On Mon, Aug 6, 2012 at 9:00 AM, Daniel Veillard <veillard redhat com> wrote:

    Hello everybody,

   As some of you following the libxml2 git commits may have noticed,
I pushed a number of patches on Friday to clean up the libxml2 code.
Most of them deal with large data inputs, and some of those
changes added specific limits to parsing, like a maximum length
for an XML Name (or NmToken), a maximum lookahead size for the parser
in push mode, etc. All the limits affecting the parser can be deactivated
by using the XML_PARSE_HUGE parser option, as with the few other existing
parser limits.
  At the API level, I also had to make an incompatible change (but
with ABI compatibility!) for parser buffers. The problem is
that those buffers were using int instead of size_t for various sizes,
leading to a variety of troubles, including security ones. How to fix
that while keeping everything public API and ABI compatible? Not doable
IMHO. So I changed one of the inner buffer structures of the parser
input and output to make it private, and fixed the issue there, but
there are still some applications that may use those fields. One
was already reported inside of GNOME, so I expect others to show up.
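
To make the int versus size_t problem concrete, here is a minimal sketch in C. It is not libxml2 code: the struct and both helper names are hypothetical, chosen only to show why a byte count held in an int needs a range check that a size_t field does not.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer in the old style: sizes are stored as int. */
struct int_sized_buf {
    unsigned char *content;
    int use;   /* bytes in use */
    int size;  /* bytes allocated */
};

/* Returns 1 if a byte count can be stored in an int field without
 * overflow, 0 otherwise. */
int fits_in_int(size_t n)
{
    return n <= (size_t)INT_MAX;
}

/* Returns 1 if adding `extra` bytes to `use` would wrap around a
 * size_t, 0 otherwise. Even size_t arithmetic needs this check. */
int add_would_overflow(size_t use, size_t extra)
{
    return extra > SIZE_MAX - use;
}
```

With a 32-bit int, any input larger than INT_MAX bytes fails `fits_in_int`, which is the kind of large-input case the cleanup is about.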

  The new buffer structure will be ABI compatible with the old one,
i.e. the old code as compiled will be able to work with the new one, as
the fields with the same values are in the same place in the new
structures. But the structures are now opaque, and the few places where
the code was using them directly will need fixing.
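
The "same fields in the same place" idea can be sketched in C as follows. Both layouts are hypothetical stand-ins, not the actual libxml2 definitions; the assumption is simply that the new structure keeps the legacy int-sized fields at their old offsets and appends wider size_t fields after them.

```c
#include <stddef.h>

/* Hypothetical "old" public layout with int-sized fields. */
struct old_buf {
    unsigned char *content;
    unsigned int use;
    unsigned int size;
};

/* Hypothetical "new" layout: legacy fields stay where they were,
 * wider fields are appended after them. */
struct new_buf {
    unsigned char *content;
    unsigned int compat_use;
    unsigned int compat_size;
    size_t use;
    size_t size;
};

/* Returns 1 when every legacy field sits at the same offset in both
 * layouts, so already-compiled code reading them finds the same data. */
int legacy_fields_line_up(void)
{
    return offsetof(struct old_buf, content) == offsetof(struct new_buf, content)
        && offsetof(struct old_buf, use) == offsetof(struct new_buf, compat_use)
        && offsetof(struct old_buf, size) == offsetof(struct new_buf, compat_size);
}
```

Old binaries that only read the legacy fields keep working under this arrangement, while source code that touches the fields directly has to move to accessor functions once the structure is opaque.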

I don't know / understand the details (I'm just a humble user), but
would this by any chance address the following issue?

    https://bugzilla.gnome.org/show_bug.cgi?id=325533 (xmlNode member
'line' is 16-bit integer, many XML files are longer than 65535 lines)

See also this report in the Debian tracker:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=445961

From what I understood of the issue comments, there was some
discussion back and forth (also about the lack of
mailing-list discussion, but that's beside the point now), including
some talk about ABI incompatibility.

If this issue isn't addressed already, maybe this is an opportunity to
fix it along with the other changes?

This issue affects us mainly when we're schema-validating some
big XML files with xmllint (files with around 130,000 lines; and no,
those are not automatically generated :-) but configuration
accumulated over 10 years for a very big application). Validation
errors after line 65535 always get reported as line 65535, which
sometimes makes it hard to find the problem (if you've made multiple edits).
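
The symptom can be reproduced with a small C sketch. This is not libxml2 code: `store_line` is a hypothetical helper, and it assumes the 16-bit field saturates at its maximum rather than wrapping around, which matches the behaviour described above.

```c
/* Largest value a 16-bit unsigned line field can hold. */
#define MAX_STORED_LINE 65535u

/* Stores a line number in a 16-bit-style field, saturating at the
 * maximum instead of wrapping (an assumption of this sketch). */
unsigned short store_line(unsigned long actual_line)
{
    if (actual_line >= MAX_STORED_LINE)
        return (unsigned short)MAX_STORED_LINE;
    return (unsigned short)actual_line;
}
```

Under that assumption an error on line 130,000 and one on line 70,000 both come back as 65535, so the reported line number no longer tells them apart.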

Thanks for all your efforts.

-- 
Johan


