Re: [xml] 2.9.0 performance regression



On Mon, Sep 17, 2012 at 02:44:33PM -0700, Jim Kukunas wrote:
Hi Folks, 

  Hi Jim and Arjan,

I've been trying to root cause a performance regression (~ 20%) when
upgrading from libxml2 2.8.0 to 2.9.0.

The benchmark I'm using is:
% xmllint --repeat --timing --stream 112M.xml > /dev/null

where 112M.xml is a 112 MB XML file generated by XMark's
xmlgen utility.

Using git bisect, I've managed to chase the regression down to a series
of commits (65c7d3b2 - 145477D8).

Some data points by git commit id:

ade10f2c: 147072.4 ms +/- 202.7962 (5 Runs)

  ade10f2c57b4bd5c3812b96bce1144d8fa1d189e is in XPath and unrelated to
  parsing

18d0db25: 146525.2 ms +/- 411.8849 (5 Runs)

65c7d3b2 - 7b9b0719: build failed

145477D8: 184805.8 ms +/- 475.2044 (5 Runs)
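
To put the data points above in relative terms, the slowdown can be computed directly from the mean timings. The following is only a minimal sketch: the function name relative_slowdown_pct is made up for illustration and is not part of libxml2 or xmllint.

```c
#include <assert.h>

/* Hypothetical helper (not part of libxml2): percentage by which
 * `candidate_ms` is slower than `baseline_ms`. A negative result
 * would mean the candidate is faster. */
double relative_slowdown_pct(double baseline_ms, double candidate_ms)
{
    assert(baseline_ms > 0.0);  /* a zero baseline makes the ratio meaningless */
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0;
}
```

With the means quoted above, relative_slowdown_pct(146525.2, 184805.8) comes out at roughly 26.1 (18d0db25 versus 145477D8), and relative_slowdown_pct(147072.4, 184805.8) at roughly 25.7 (ade10f2c versus 145477D8). Both per-commit figures are somewhat higher than the ~20% quoted for the overall 2.8.0 to 2.9.0 upgrade.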

  Well, that's the switch to the new buffer structures, which should be
64-bit clean. The only extra processing would be related to the
UPDATE_COMPAT and CHECK_COMPAT macros used on entry and exit of the new
buffer calls to ensure ABI compatibility with the old buffers. One way to
test whether this is indeed the cause would be to comment out
  #define WITH_BUFFER_COMPAT
at the top of buf.c and recheck. Can you do that and report?
  If not, I guess some profiling will be needed to understand the cause
of the slowdown. Maybe the switch to the new buffers introduced a glitch
in the libxml2 I/O code leading to a performance degradation. Some direct
pointer dereferencing has been replaced by clean function calls, but that
should not generate a 20% hit on parsing; there is something else going
on...
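
For readers unfamiliar with the pattern described above, here is a rough sketch of what "compat macros on entry and exit of an accessor function" can look like. It is an illustration under stated assumptions, not the actual contents of buf.c: the struct sketch_buf, the SKETCH_* macros and sketch_buf_advance are all hypothetical names, and the real UPDATE_COMPAT and CHECK_COMPAT macros may differ in detail.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Comment this out to compile the accessor without the sync overhead,
 * analogous to disabling WITH_BUFFER_COMPAT in the real code. */
#define SKETCH_WITH_COMPAT

/* Hypothetical buffer: size_t fields for the new code, plus int mirrors
 * that legacy callers may still read or write directly. */
typedef struct {
    int compat_use;    /* legacy view of `use`  */
    int compat_size;   /* legacy view of `size` */
    size_t use;        /* bytes currently used  */
    size_t size;       /* bytes allocated       */
} sketch_buf;

#ifdef SKETCH_WITH_COMPAT
/* On entry: pick up changes a legacy caller made through the int fields. */
#define SKETCH_CHECK_COMPAT(b) do {                                        \
        if ((b)->use != (size_t)(b)->compat_use &&                         \
            (b)->compat_use < INT_MAX)                                     \
            (b)->use = (size_t)(b)->compat_use;                            \
        if ((b)->size != (size_t)(b)->compat_size &&                       \
            (b)->compat_size < INT_MAX)                                    \
            (b)->size = (size_t)(b)->compat_size;                          \
    } while (0)
/* On exit: publish the new values to the legacy fields, clamped to INT_MAX. */
#define SKETCH_UPDATE_COMPAT(b) do {                                       \
        (b)->compat_use = (b)->use < INT_MAX ? (int)(b)->use : INT_MAX;    \
        (b)->compat_size = (b)->size < INT_MAX ? (int)(b)->size : INT_MAX; \
    } while (0)
#else
#define SKETCH_CHECK_COMPAT(b) do { } while (0)
#define SKETCH_UPDATE_COMPAT(b) do { } while (0)
#endif

/* Accessor standing in for what used to be a direct `buf->use += n`.
 * Returns 0 on success, -1 on a NULL buffer or insufficient room. */
int sketch_buf_advance(sketch_buf *buf, size_t n)
{
    if (buf == NULL)
        return -1;
    SKETCH_CHECK_COMPAT(buf);
    if (buf->use > buf->size || n > buf->size - buf->use)
        return -1;
    buf->use += n;
    SKETCH_UPDATE_COMPAT(buf);
    return 0;
}
```

Each call now costs a function call plus a few compares and stores where there used to be a single field update. That is cheap per call, which is why a 20% hit looks surprising, but it does add up if the accessor sits on a hot path that is executed for every chunk of input.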

The XML file was served from an EXT4 partition. I also ran the tests with the
XML file on a tmpfs partition to reduce I/O impact; however, the relative
performance of the commit ids was unchanged.

Has anyone run into this before?

  No, nobody complained so far, you're the first :-)

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


