Re: [xml] libxml2 2.9.2 hangs running multi-threaded on Windows



  Thanks, that's now committed in git :-) !

  https://git.gnome.org/browse/libxml2/commit/?id=620a70615e68b30db1a80a993180a41dc24f12b9

Daniel

On Tue, Mar 03, 2015 at 09:55:30AM +0000, Steven Nairn wrote:
Hi,

After recently upgrading to 2.9.2 for use in our (multi-threaded)
program we started to experience occasional hangs on Windows (the
other platforms were fine).

Attaching to the hung process with gdb showed that all the threads
were waiting on xmlDictMutex, which is an xmlRMutex. The count field of the
xmlRMutex was zero, which indicated that the mutex should not have
been locked. However, the cs field showed that the CriticalSection was
locked and was held by a thread that had completed. So, that thread
had locked the mutex but not unlocked it.

Eventually the problem was tracked down to the maintenance of the
count field. The relevant code fragments (with non-Windows stuff
removed) are:
----
typedef struct _xmlRMutex {
    CRITICAL_SECTION cs;
    unsigned int count;
} *xmlRMutexPtr;

void xmlRMutexLock(xmlRMutexPtr tok)
{
    EnterCriticalSection(&tok->cs);
    tok->count++;
}

void xmlRMutexUnlock(xmlRMutexPtr tok)
{
    if (tok->count > 0) {
        LeaveCriticalSection(&tok->cs);
        tok->count--;
    }
}
----

So, when locking the mutex the count field is incremented inside the
critical section, but when unlocking it is decremented outside the
critical section. The increment and decrement are not atomic, so if one
thread is locking the mutex while another is unlocking it, one of the
updates to the count field can be lost. This is what was happening
in our case: the count ended up at zero while the CriticalSection was
still held, so a later call to xmlRMutexUnlock did not call
LeaveCriticalSection.
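
To make that interleaving concrete, here is a minimal single-threaded
sketch that replays one bad ordering by hand. It is only a model: the
names model_rmutex and replay_lost_update are hypothetical and not part
of libxml2, a bool stands in for the CRITICAL_SECTION, and the
read-modify-write of count is split into explicit load and store steps
on the assumption that the compiler does not make it atomic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the Windows xmlRMutex: `held` stands in for the
 * CRITICAL_SECTION, `count` for the count field of the real struct. */
struct model_rmutex {
    bool held;          /* true while some thread is inside the section */
    unsigned int count; /* recursion counter */
};

/* Replays one bad interleaving of the original unlock (thread A) with a
 * concurrent lock (thread B). Returns true if the model ends up in the
 * state seen in the debugger: section still held, count == 0. */
bool replay_lost_update(void)
{
    struct model_rmutex m = { true, 1 };   /* thread A holds the mutex once */

    /* Thread A: xmlRMutexUnlock sees count > 0 and leaves the section. */
    assert(m.count > 0);
    m.held = false;

    /* Thread B: xmlRMutexLock enters the now-free section. */
    m.held = true;

    /* Both threads now update count with separate load and store steps. */
    unsigned int a_tmp = m.count;          /* A loads 1 for count-- */
    unsigned int b_tmp = m.count;          /* B loads 1 for count++ */
    m.count = b_tmp + 1;                   /* B stores 2 */
    m.count = a_tmp - 1;                   /* A stores 0: B's increment is lost */

    /* Thread B: xmlRMutexUnlock tests count > 0, finds 0, skips the leave. */
    if (m.count > 0) {
        m.held = false;
        m.count--;
    }
    return m.held && m.count == 0;
}
```

Calling replay_lost_update() returns true, i.e. the model finishes with
the section held and count at zero, which matches what gdb showed.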

The fix is simple: when unlocking the xmlRMutex, decrement the count
field before leaving the critical section. That is:
----
void xmlRMutexUnlock(xmlRMutexPtr tok)
{
    if (tok->count > 0) {
        tok->count--;
        LeaveCriticalSection(&tok->cs);
    }
}
----
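
For anyone who wants to exercise the corrected ordering off Windows,
the following is a rough portable sketch, not the libxml2 code. It
assumes POSIX threads, with a recursive pthread mutex standing in for
the CRITICAL_SECTION; demo_rmutex, worker_arg and run_demo are
hypothetical names used only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical POSIX analogue of the Windows struct. */
struct demo_rmutex {
    pthread_mutex_t cs;   /* stands in for CRITICAL_SECTION */
    unsigned int count;   /* only touched while cs is held */
};

int demo_rmutex_init(struct demo_rmutex *tok)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    rc = pthread_mutex_init(&tok->cs, &attr);
    pthread_mutexattr_destroy(&attr);
    tok->count = 0;
    return rc;
}

void demo_rmutex_lock(struct demo_rmutex *tok)
{
    pthread_mutex_lock(&tok->cs);
    tok->count++;
}

void demo_rmutex_unlock(struct demo_rmutex *tok)
{
    if (tok->count > 0) {
        tok->count--;                     /* decrement while still holding cs */
        pthread_mutex_unlock(&tok->cs);
    }
}

struct worker_arg { struct demo_rmutex *tok; long *total; int iters; };

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < a->iters; i++) {
        demo_rmutex_lock(a->tok);
        demo_rmutex_lock(a->tok);         /* recursive acquisition */
        (*a->total)++;
        demo_rmutex_unlock(a->tok);
        demo_rmutex_unlock(a->tok);
    }
    return NULL;
}

/* Runs nthreads workers; returns the final total, or -1 on setup failure. */
long run_demo(int nthreads, int iters)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    struct demo_rmutex tok;
    long total = 0;
    struct worker_arg arg = { &tok, &total, iters };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || demo_rmutex_init(&tok) != 0)
        return -1;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    assert(tok.count == 0);               /* every lock was matched by an unlock */
    pthread_mutex_destroy(&tok.cs);
    return started == nthreads ? total : -1;
}
```

For example, run_demo(4, 10000) should return 40000 if no update is
lost. A stress run like this can only fail to find a problem; it does
not prove the ordering correct.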

This problem was introduced in commit id
8854e4631844eac8dbae10cc32904f27d5268af7 for bug 737851. Prior to the
change the Windows CriticalSections were definitely not being left
properly when xmlRMutexes were used recursively. However, at least in
the way we use libxml2, that problem was masked since xmlRMutexes were
not used recursively.

I've added a comment to the bug in bugzilla.

Cheers,
Steve
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
https://mail.gnome.org/mailman/listinfo/xml

-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

