Re: [xml] xmlReadFile/xmlReadMemory - Performance or Concurrency problem
- From: Martin Trappel <0xCDCDCDCD gmx at>
- To: xml gnome org
- Subject: Re: [xml] xmlReadFile/xmlReadMemory - Performance or Concurrency problem
- Date: Mon, 10 Nov 2008 13:04:06 +0100
Daniel Veillard wrote:
On Tue, Nov 04, 2008 at 10:14:48AM +0100, Martin Trappel wrote:
Hi there.
I could use a few wild guesses, because I've quite run out of them:
* libxml2 2.6.27
* Windows XPsp2
I have a process that is running approx. 30 threads, where one of these
threads is doing some calculations and network communication with a
hardware device. This is the only thread that spends a measurable
amount of time doing anything (it takes about 10% CPU).
So far, no libxml2 involved.
Now, in an additional thread, I start up a libxml parser to parse a 4MB  
xml file. (When tested in isolation parsing of this file takes approx  
200-300ms).
In this process, the parsing (an xmlReadFile call, or an xmlReadMemory
call with the file read into memory) takes between 2 and 12 seconds.
That in itself isn't the problem; of course I expected it to take longer
due to the heavy load.
The problem is that the xmlRead* call takes 99% of the CPU, which slows
the other thread down so much that it fails due to a fixed timeout we
have for message processing.
What is really interesting is that when I add some artificial CPU load
before or after xmlReadFile (some dummy calculations in a loop for 10
seconds), that takes up 99% CPU as well, but the message processing in
the other thread is not aborted.
Could this be due to many heap-allocations from  
xmlReadFile/xmlReadMemory? Some other process global resource that could  
be the cause?
any guesses welcome!
  No idea. The Windows memory allocator gave us serious problems in the
past in the face of realloc() use. Thread concurrency may be a problem too.
As for realloc: As far as I could see no realloc calls are done during 
build-up of a tree (with xmlNewDoc, xmlNewDocNode, xmlAddChild, 
xmlAddProp, ...) and I also had these problems if my test-thread-code 
did just that.
I have finally found a solution for our performance problem.
It turned out that the problem was really rooted in Heap concurrency 
problems (in what manner exactly, I don't really dare say.)
I ended up giving libxml2 its own Win32 heap (HeapCreate(..)) via
xmlMemSetup + xmlGcMemSetup, plus simple forwarder functions to the
Heap*(..) Win32 functions, and now everything runs just fine.
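
For readers who want to try the same approach, here is a minimal sketch
of what such forwarder functions might look like. It is not the code
from our application. The names xml_heap_init, xml_heap_malloc,
xml_heap_realloc, xml_heap_free, xml_heap_strdup and xml_heap_install,
and the XML_HEAP_WITH_LIBXML2 macro, are all made up for illustration.
The sketch registers the forwarders with xmlMemSetup only;
xmlGcMemSetup takes one extra "atomic" malloc argument, for which the
same forwarder could be passed. The non-Windows branches fall back to
malloc/free purely so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
static HANDLE xml_heap = NULL;   /* private heap (hypothetical name) */
#endif

/* Create the private heap; returns 0 on success, -1 on failure. */
int xml_heap_init(void)
{
#ifdef _WIN32
    ULONG lfh = 2;                    /* 2 requests the low-fragmentation heap */
    xml_heap = HeapCreate(0, 0, 0);   /* serialized, growable */
    if (xml_heap == NULL)
        return -1;
    /* Best effort: the heap is still usable if this call fails. */
    HeapSetInformation(xml_heap, HeapCompatibilityInformation,
                       &lfh, sizeof(lfh));
#endif
    return 0;
}

void *xml_heap_malloc(size_t size)
{
#ifdef _WIN32
    return HeapAlloc(xml_heap, 0, size);
#else
    return malloc(size);
#endif
}

void xml_heap_free(void *ptr)
{
    if (ptr == NULL)
        return;                       /* match free(NULL) semantics */
#ifdef _WIN32
    HeapFree(xml_heap, 0, ptr);
#else
    free(ptr);
#endif
}

void *xml_heap_realloc(void *ptr, size_t size)
{
    if (ptr == NULL)                  /* HeapReAlloc does not accept NULL */
        return xml_heap_malloc(size);
#ifdef _WIN32
    return HeapReAlloc(xml_heap, 0, ptr, size);
#else
    return realloc(ptr, size);
#endif
}

char *xml_heap_strdup(const char *s)
{
    size_t len;
    char *copy;
    assert(s != NULL);
    len = strlen(s) + 1;              /* include the terminating NUL */
    copy = (char *)xml_heap_malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

#ifdef XML_HEAP_WITH_LIBXML2          /* hypothetical opt-in macro */
#include <libxml/xmlmemory.h>

/* Call once, before any other libxml2 function allocates memory. */
int xml_heap_install(void)
{
    if (xml_heap_init() != 0)
        return -1;
    return xmlMemSetup(xml_heap_free, xml_heap_malloc,
                       xml_heap_realloc, xml_heap_strdup);
}
#endif
```

The one thing to watch is ordering: the allocators have to be installed
before libxml2 allocates anything, otherwise memory obtained from one
heap may later be freed to the other.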
To put the whole thing in context:
* With its own heap, parsing a 6MB XML file takes <500ms in this
specific environment.
* As it was, with only the process heap for everything, it took 2-5 
seconds on first run and then increased up to 11 seconds for subsequent 
runs of the same parsing code (xmlReadFile) in this process.
* We're running heavily multithreaded where a few threads use 
significant (10% or more) processor time.
* I was utterly unable to reproduce the behavior outside our application 
even with a few dummy threads thrown in.
Note that I now use a low-fragmentation heap for libxml2, and that gives
a slight additional advantage. (I see 5%+ faster parsing/tree building
in a standalone test app.)
So to sum up: There seem to be certain situations (under Win32) where
giving libxml2 a separate heap has a significant performance advantage.
cheers,
Martin