Re: [xml] xmlReadFile/xmlReadMemory - Performance or Concurrency problem
- From: Martin Trappel <0xCDCDCDCD gmx at>
- To: xml gnome org
- Subject: Re: [xml] xmlReadFile/xmlReadMemory - Performance or Concurrency problem
- Date: Mon, 10 Nov 2008 13:04:06 +0100
Daniel Veillard wrote:
On Tue, Nov 04, 2008 at 10:14:48AM +0100, Martin Trappel wrote:
Hi there.
I could use a few wild guesses, because I've quite run out of them:
* libxml2 2.6.27
* Windows XPsp2
I have a process that is running approx 30 threads where one of these
threads is doing some calculations and network communication with a
hardware device. This one is the only thread that spends a measurable
time doing anything (i.e. it takes about 10% cpu).
So far, no libxml2 involved.
Now, in an additional thread, I start up a libxml parser to parse a 4MB
xml file. (When tested in isolation parsing of this file takes approx
200-300ms).
In this process, the parsing (xmlReadFile, or xmlReadMemory call with
file read into memory) takes between 2 sec and 12 sec. That ain't the
problem and of course I expected it to take longer due to heavy load.
The problem now is, that the xmlRead* call takes 99% cpu resources which
causes the other thread to slow down so much that it fails due to a
fixed timeout for msg processing we have.
What is really interesting now is, that when I add some artificial
cpu-load before or after xmlReadFile (some dummy calculations in a loop
for 10 seconds) that takes up 99% cpu as well, but the msg processing in
the other thread ain't aborted.
Could this be due to many heap-allocations from
xmlReadFile/xmlReadMemory? Some other process global resource that could
be the cause?
any guesses welcome!
No idea. The Windows memory allocator gave us serious problems in the
past in the face of realloc() use. Thread concurrency may be a problem too.
As for realloc: As far as I could see no realloc calls are done during
build-up of a tree (with xmlNewDoc, xmlNewDocNode, xmlAddChild,
xmlAddProp, ...) and I also had these problems if my test-thread-code
did just that.
I have finally found a solution for our performance problem.
It turned out that the problem was really rooted in Heap concurrency
problems (in what manner exactly, I don't really dare say.)
I ended up giving libxml2 its own win32 Heap (HeapCreate(..)) via
xmlMemSetup+xmlGcMemSetup plus simple forwarder functions to the Heap*(..)
WIN32 functions and now everything runs just fine.
To put the whole thing in context:
* With its own heap, the parsing of a 6MB XML File takes <500ms in this
specific environment.
* As it was, with only the process heap for everything, it took 2-5
seconds on first run and then increased up to 11 seconds for subsequent
runs of the same parsing code (xmlReadFile) in this process.
* We're running heavily multithreaded where a few threads use
significant (10% or more) processor time.
* I was utterly unable to reproduce the behavior outside our application
even with a few dummy threads thrown in.
Note that I now use a low fragmentation heap for libxml2 and that gives
a slight advantage. (I see 5%+ faster parsing/tree building in a
standalone test app.)
So to sum up: There seem to be certain situations (under win32) where
giving libxml2 a separate heap has a significant performance advantage.
cheers,
Martin