Re: [xml] xmlNodeDump performance



On Thu, Jun 7, 2012 at 6:25 PM, Chang Im <Chang IM watchguard com> wrote:
Hi,

I am new to libxml and XML itself.  We are using version 2.7.7.

Welcome :).

I am looking into a performance issue related to xmlNodeDump of an entire tree.

The size of the XML file is about 20 MB and it takes about 60+ seconds with
xmlNodeDump.

Profiling with OProfile showed memcpy as the hot spot.

This is a telltale sign that you're resizing a buffer too many times,
which is why making the buffer bigger helps.


Is this something that needs to be tuned properly to support large XML files?

The easiest solution is to change the buffer allocation scheme so that
the buffer size doubles on each reallocation. This guarantees at most
O(log(n)) resizes (instead of the O(n) you're seeing at the moment),
which gives you very good performance without having to think too
hard.
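
To put rough numbers on that, here is a minimal sketch in plain C (no
libxml2 involved). It models a buffer that grows either by a fixed step
or by doubling, and counts the resizes plus the bytes that would be
copied if every resize moved the whole buffer. The names grow_fixed and
grow_doubling, the growth_cost struct, and the 4096-byte starting size
and step are all assumptions made for illustration; they are not
libxml2's actual internals.

```c
#include <assert.h>
#include <stddef.h>

/* Cost of growing a buffer from `initial` bytes until it holds `target`. */
typedef struct {
    size_t resizes;                  /* number of reallocations */
    unsigned long long bytes_copied; /* bytes moved if every resize copies */
} growth_cost;

/* Hypothetical model: each resize adds a fixed `step` bytes. */
growth_cost grow_fixed(size_t initial, size_t step, size_t target)
{
    growth_cost cost = {0, 0};
    size_t cap = initial;

    assert(initial > 0 && step > 0);
    while (cap < target) {
        cost.bytes_copied += cap;   /* old contents copied to the new block */
        cap += step;
        cost.resizes++;
    }
    return cost;
}

/* Hypothetical model: each resize doubles the capacity.
 * No overflow guard; fine for sizes far below SIZE_MAX. */
growth_cost grow_doubling(size_t initial, size_t target)
{
    growth_cost cost = {0, 0};
    size_t cap = initial;

    assert(initial > 0);
    while (cap < target) {
        cost.bytes_copied += cap;   /* old contents copied to the new block */
        cap *= 2;
        cost.resizes++;
    }
    return cost;
}
```

Under those assumptions, reaching 20 MB (20971520 bytes) takes 5119
fixed-step resizes and roughly 53.7 GB of copying in the worst case,
against 13 doubling resizes and about 32 MB of copying. A real realloc
can often extend a block in place, so treat the copy totals as an upper
bound rather than a measurement.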

You can set the default with:

xmlSetBufferAllocationScheme(XML_BUFFER_ALLOC_DOUBLEIT);

There's a slight memory cost to this (in the worst case the buffer
ends up nearly twice as big as it needs to be), but that's usually a
very acceptable trade-off.
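
A small sketch of where that worst case comes from, again in plain C
and under assumed numbers: doubled_capacity is a hypothetical helper
that returns the capacity a doubling buffer ends up with for a given
payload, starting from an assumed initial size. It does not reproduce
libxml2's own buffer code.

```c
#include <assert.h>
#include <stddef.h>

/* Capacity reached by doubling from `initial` until `needed` bytes fit.
 * Hypothetical helper; no overflow guard, so keep sizes far below SIZE_MAX. */
size_t doubled_capacity(size_t initial, size_t needed)
{
    size_t cap = initial;

    assert(initial > 0);
    while (cap < needed)
        cap *= 2;
    return cap;
}
```

Starting from 4096 bytes, a payload of exactly 16 MiB (16777216 bytes)
fits with no slack, but one byte more forces a 32 MiB buffer, which is
the "nearly twice" case. For a 20 MB (20971520 byte) document the
buffer also lands on 32 MiB, so about 1.6 times the payload.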

Conrad


