Re: [xml] threading and xmlReadFile



On Tue, May 14, 2013 at 12:14:53PM -0700, Jerry Cain wrote:
I am relatively new to libxml2, and I've stumbled across an issue around xmlReadFile and threading.  I am 
running on x86_64-linux, and I built and installed libxml2 myself, ensuring that the build was configured 
with --with-threads.  I am coding in C++11, using the C++11 thread package, which is itself implemented in 
terms of pthreads in g++4.6, which I'm currently using.  I've examined the bug database, and I've seen 
nothing to imply this general of a problem with threads and libxml2 exists.

I am seeing very occasional deadlock in an application with four threads—one main thread, which spawns 
three child threads, each of which attempts to parse a remote RSS document, as with:
[...]
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f0afc1e5d23 in xmlRMutexLock () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#2  0x00007f0afc1e20b9 in ?? () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#3  0x00007f0afc1e2648 in ?? () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#4  0x00007f0afc1e35ff in xmlACatalogResolve () from /usr/lib/x86_64-linux-gnu/libxml2.so.2

  Can you recompile without optimization? Without symbols for the two
functions called from xmlACatalogResolve() (frames 2 and 3 above), there
is way too much guessing.

  Somehow it's the catalog resolution code which seems to deadlock.
Are you using the latest version? Could you configure without catalog
support and see whether it still happens?

  It is a bit surprising, as testThreads.c does this kind of heavy
parallel testing and catalog access without raising any issue.

#5  0x00007f0afc19d7e3 in ?? () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#6  0x00007f0afc1a0354 in ?? () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#7  0x00007f0afc1a01df in xmlLoadExternalEntity () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#8  0x00007f0afc186b36 in xmlCreateURLParserCtxt () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
#9  0x00007f0afc18cdaa in xmlReadFile () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
[...]
 
Frame 7's rbx register can be used to confirm the URL strings are different for each thread, so no two 
threads are unintentionally racing to parse the same remote RSS feed.

  Actually, as long as the parser contexts are different you can test

(gdb) frame 7
#7  0x00007f0afc1a01df in xmlLoadExternalEntity () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
(gdb) print (char *)$rbx
$22 = 0x7f0af00022d0 "http://feeds.washingtonpost.com/rss/world"
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f0afa954700 (LWP 25902))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
162     in ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
(gdb) frame 7
#7  0x00007f0afc1a01df in xmlLoadExternalEntity () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
(gdb) print (char *)$rbx
$23 = 0x7f0aec0022c0 "http://feeds.latimes.com/latimes/news"
(gdb) thread 4
[Switching to thread 4 (Thread 0x7f0afb155700 (LWP 25901))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
162     in ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
(gdb) frame 7
#7  0x00007f0afc1a01df in xmlLoadExternalEntity () from /usr/lib/x86_64-linux-gnu/libxml2.so.2
(gdb) print (char *)$rbx
$24 = 0x7f0af4002350 "http://feeds.chicagotribune.com/chicagotribune/news/nationworld"

Of course, xmlRMutexLock and pthread_cond_wait are each at the bottom of the three thread stacks, so my 
(hopefully not incorrect) assumption is that they are all waiting to acquire the same mutex.
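
For readers wondering why pthread_cond_wait, rather than a plain mutex lock, sits directly under xmlRMutexLock in each stack: a reentrant lock is commonly built from an ordinary mutex plus a condition variable, and threads that find the lock held by someone else sleep on the condition variable. The sketch below is only an illustration of that general shape in C++11. It is not libxml2's source, and the names ReentrantMutex and run_workers are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reentrant mutex. NOT libxml2's implementation, only an
// illustration of a lock whose waiters block in a condition wait.
class ReentrantMutex {
public:
    void lock() {
        std::unique_lock<std::mutex> guard(state_);
        const std::thread::id self = std::this_thread::get_id();
        if (held_ > 0 && owner_ == self) {
            ++held_;            // the owner re-enters without blocking
            return;
        }
        // Another thread owns the lock: sleep until it is fully released.
        // A thread stuck here would show a condition wait in frame 0.
        released_.wait(guard, [this] { return held_ == 0; });
        owner_ = self;
        held_ = 1;
    }

    void unlock() {
        std::lock_guard<std::mutex> guard(state_);
        if (--held_ == 0) {
            owner_ = std::thread::id();
            released_.notify_one();   // wake one sleeping waiter
        }
    }

private:
    std::mutex state_;                  // protects owner_ and held_
    std::condition_variable released_;  // signalled when held_ drops to 0
    std::thread::id owner_;
    unsigned held_ = 0;                 // nesting depth of the current owner
};

// Three workers each take the lock twice (nested) and bump a shared counter.
int run_workers(int iterations) {
    ReentrantMutex m;
    int counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < 3; ++t) {
        workers.emplace_back([&m, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                m.lock();
                m.lock();     // nested acquisition by the owner
                ++counter;
                m.unlock();
                m.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

With a lock of this shape, every waiter sleeps until some owner's final unlock brings the nesting depth back to zero. If all three parser threads are asleep in the wait, the useful question is which thread (if any) still counts as the owner, which is why identifying the exact mutex matters.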

This happens very rarely, but I'm assuming it's happening because of a race I'm not protecting against.

  I assume you called xmlInitParser(), but the problem doesn't look to be there.
I need to know which mutex they are freezing on.

Daniel

-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

