Re: [xml] IO callbacks are not thread-safe

On Monday 06 April 2009 18:36:25 Michael Ludwig wrote:
Nick Wellnhofer wrote:
The input and output callbacks of libxml are stored in static arrays
in xmlIO.c, so any use of the callback functions is not thread-safe.

If someone has time to explain this to the uninitiated: What are these
input and output callbacks of libxml? Or are they not part of the Perl
interface, just part of the C interface?

Hello Michael,

This is a feature of libxml2, available in the Perl bindings via the
XML::LibXML::InputCallback module. The callbacks allow you to bypass the
"normal" way libxml2 retrieves data from URLs. You can register callbacks that
recognize which URLs you want to handle differently and feed any content to the
parser. You can use this for things like URL rewriting, or for special URL
schemes that pull data from a database or other application-specific resource.
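
As a rough illustration of the mechanism, here is a small self-contained Python
model. It is not the libxml2 or XML::LibXML API: the db:// scheme, the FAKE_DB
table and all function names are made up for the example. Each registered
handler has a match function that decides whether it wants a URL, plus
open/read/close functions that supply the content.

```python
import io

# Hypothetical in-memory "database" standing in for an application resource.
FAKE_DB = {"db://docs/1": b"<doc id='1'/>"}

# Registry of (match, open, read, close) tuples. This is a model only.
handlers = []

def register(match, open_, read, close):
    """Add one handler group to the registry."""
    handlers.append((match, open_, read, close))

def fetch(url):
    """Return the bytes for url from the first handler whose match() accepts it."""
    # In this model the most recently registered handler is consulted first.
    for match, open_, read, close in reversed(handlers):
        if match(url):
            ctx = open_(url)
            try:
                chunks = []
                while True:
                    chunk = read(ctx, 4096)
                    if not chunk:
                        break
                    chunks.append(chunk)
                return b"".join(chunks)
            finally:
                close(ctx)
    raise LookupError("no handler for " + url)

# A handler for the made-up db:// scheme.
register(
    lambda url: url.startswith("db://"),   # match
    lambda url: io.BytesIO(FAKE_DB[url]),  # open
    lambda ctx, n: ctx.read(n),            # read
    lambda ctx: ctx.close(),               # close
)
```

With this in place, fetch("db://docs/1") returns the stored bytes, and any URL
that no handler matches raises LookupError.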

In many cases this shouldn't be a problem, if callbacks are registered
only at the start of a program. But the Perl bindings register and
unregister callbacks every time a document is parsed. I can reproduce
random segfaults or other errors when processing many thousand
documents in concurrent threads with the libxml Perl bindings.

Two unrelated questions, just to satisfy my curiosity:

What are the benefits of processing documents concurrently?

The scenario discussed here is the Apache web server using threads instead of
forked processes to serve content to concurrent clients.

Or rather,
are there any without multiple processors? 

This depends on the type of your application; one thread can parse while
another is waiting for I/O.

And can you control the
number of processors to be engaged by Perl?

Well, these are not Perl threads (i.e. threads within one Perl interpreter) we
are talking about; these are multiple threads, each of which can run its own
Perl/PHP or whatever interpreter.

Perl threads are a different and ugly beast: do not use them unless you want
your application to get much, much slower, which you don't. Also, the support
for Perl threads in XML::LibXML is very limited due to very problematic
memory management.

Could you post a sample of how to achieve this concurrent use of LibXML
in Perl?

For Perl threads, some examples are in the XML::LibXML documentation, in the
section THREAD SUPPORT. But trust me, you do not want to use them!

Please use perl-xml listserv activestate com for discussion specific to Perl-


-- Petr
