Re: [xml] libxml2 thread safety (SUMMARY)



On Wed, Oct 03, 2001 at 09:56:20AM +0100, Gary Pennington wrote:
Secondly, I believe that it is possible to make libxml2 thread-safe and 
backwards compatible. Key to this belief is the use of thread-specific 
data (TSD), as I described in my code samples yesterday. All global state 
flags will be moved into a TSD structure, which means that each thread 
will have its 
own view of the libxml2 configuration. The big plus of this approach is 
that multiple threads get private behaviour out of libxml2 without 
requiring any re-writing of user code. I'll repeat that, since its 
importance is near the top of my list and no-one else seemed to be 
bothered about this. NO re-writing of code for existing single-threaded 
libxml2 consumers. The big negative of this approach is that you need to 
be very careful about behaviour when passing documents around between 
threads, since each thread will have its own view of the libxml2 
configuration flags. I believe this is a small price to pay for 
backwards compatibility, but it should be considered.
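
For readers without yesterday's samples to hand, the TSD idea can be sketched roughly as below. This is only an illustration using POSIX threads, not the code from the proposal: struct demo_state, its two flags and every demo_* function are hypothetical names, and none of them exist in libxml2.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's global configuration flags. */
struct demo_state {
    int keep_blanks;   /* illustrative flag, default 1 */
    int validate;      /* illustrative flag, default 0 */
};

static pthread_key_t demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

/* Runs once per process: create the key. free() is registered as the
 * destructor, so a thread's copy is released when that thread exits. */
static void demo_key_init(void)
{
    if (pthread_key_create(&demo_key, free) != 0)
        abort();
}

/* Return the calling thread's private state, allocating it on first use. */
struct demo_state *demo_state_get(void)
{
    struct demo_state *st;

    pthread_once(&demo_once, demo_key_init);
    st = pthread_getspecific(demo_key);
    if (st == NULL) {
        st = malloc(sizeof(*st));
        if (st == NULL)
            abort();
        st->keep_blanks = 1;
        st->validate = 0;
        if (pthread_setspecific(demo_key, st) != 0)
            abort();
    }
    return st;
}

/* Worker thread: changes a flag in its own copy and reports what it sees. */
static void *demo_worker(void *arg)
{
    int *seen = arg;

    demo_state_get()->validate = 1;
    *seen = demo_state_get()->validate;
    return NULL;
}

/* Usage: the worker sees its own change, the caller's flag is untouched. */
void demo_tsd_usage(void)
{
    pthread_t tid;
    int worker_seen = -1;

    demo_state_get()->validate = 0;
    if (pthread_create(&tid, NULL, demo_worker, &worker_seen) != 0)
        abort();
    pthread_join(tid, NULL);

    assert(worker_seen == 1);
    assert(demo_state_get()->validate == 0);
}
```

The usage function also shows the negative mentioned above: anything built under the worker's flags and then handed to the calling thread would be processed under different settings there.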

  I have a small concern around the "recompilation for threaded usage"
point. I think it is possible to make the changes in a backward compatible
way, and allow the same shared libraries to serve both kinds of users.

Thirdly, if we do decide that backwards compatibility is not an issue. 

  Well, clearly it is an issue. I can't start forking code now: I'm
not ready to work on libxml3 yet, and libxml2 is expected to stabilize 
for the Gnome-2.0 release. And if (as I believe) we can provide a
binary-compatible version with thread support, then this should be pursued.

Then there are many measures that can be taken to improve the library. 
New re-entrant APIs for setting state using user supplied storage, 
remove all global state, etc... However, I don't believe this is a 
realistic solution as it will cause major compatibility issues for the 
existing user base. Let's keep discussing this, but I believe that 
backwards compatibility is right at the top of the design goals for any 
change to the library.

  Yes, definitely.

Fourthly, you are near the top of my list of people we should help 
(well, to be more precise - not hinder). I'm not going to provide any 
thread support to help you, but I certainly don't want to break any of 
your existing code. When the library is built for your platform, the 
thread specific code will be excluded from the compile and you will get 
a single threaded libxml2 that you can use as you choose. Key to this 
approach working is the adoption of TSD and the limited intrusiveness on 
libxml2 which I advocate.

  I still think we can come up with a single-mode solution.

Finally, this is an interesting question since it reveals the difference 
between making a library thread-safe and having a thread-safe convention 
for using the library. I believe that this approach will result in a 
thread-safe convention for using libxml2; whereas I would like to make 
the library thread-safe. Taking this approach means that you must trust 
your clients to behave properly; they mustn't try to manipulate global 
state whilst a parse is in progress, for instance. I don't think that 
this will result in a library that can be used effectively by 
multi-threaded clients; in fact, I have already considered this option at 
some length and discarded it before making this proposal. One very good 
reason for discarding this approach becomes evident when you consider 
what would happen if you tried to link your application with another 
library which also used libxml2, but directly, not following your 
synchronisation conventions or using your synchronisation wrapper layer. 
I think the answer to the question I posed is a clear no.
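
To make the distinction concrete, a "thread-safe convention" might look something like the sketch below. All names (demo_validate_flag, demo_parse, demo_parse_with_flag, demo_other_library_call) are hypothetical stand-ins and not libxml2 functions; the point is only that the lock protects nothing unless every caller in the process goes through the wrapper.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical process-wide flag standing in for a library global. */
int demo_validate_flag = 0;

/* Stand-in for a library entry point that consults the global. */
int demo_parse(void)
{
    return demo_validate_flag;
}

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* The convention: well-behaved callers set the flag and parse only
 * through this wrapper, which serialises them against each other. */
int demo_parse_with_flag(int validate)
{
    int result;

    pthread_mutex_lock(&demo_lock);
    demo_validate_flag = validate;
    result = demo_parse();
    pthread_mutex_unlock(&demo_lock);
    return result;
}

/* A second library linked into the same process that uses the code
 * directly. It never takes demo_lock, so nothing stops it from changing
 * the flag while another thread is inside demo_parse_with_flag(). */
void demo_other_library_call(void)
{
    demo_validate_flag = 1;   /* unsynchronised write */
    (void)demo_parse();
}

/* Single-threaded usage: the wrapper behaves as expected on its own. */
void demo_convention_usage(void)
{
    assert(demo_parse_with_flag(1) == 1);
    assert(demo_parse_with_flag(0) == 0);
}
```

Run from one thread this is harmless; the problem only appears when demo_other_library_call() runs concurrently with a wrapped call, and no amount of care in the wrapper can prevent that.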

  One minute...
If the extra library hasn't been recompiled, you can't guarantee any thread
safety anyway, right? If it had been recompiled and there is only one
mode of access (i.e. removing the threaded vs. non-threaded way of getting
the information) then I don't see where the problem would come from.

 My personal inclination, as I explained yesterday, would be:
   - to keep all global settings in a new structure
   - that structure is accessed using an overridable function returning
     a pointer to that structure
   - the old name for the global variable is turned into a macro
     dereferencing the pointer returned by that function (I think
     this will keep the read/write capability of that property).

 Do you foresee any significant problem with this approach? Any recompiled
library would work in the same mode as the application. There is a single
atomic step to switch to a threaded mode which is to set the global variable
containing the pointer to the accessor function.
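
 A minimal sketch of those three points, under stated assumptions: every name below (struct demo_globals, demo_get_globals, demoKeepBlanks, demoValidate, demo_enable_threads) is made up for illustration and is not an existing libxml2 symbol, and the per-thread copy uses the C11 _Thread_local keyword purely as shorthand where a real implementation would more likely use TSD.

```c
#include <assert.h>

/* Hypothetical structure gathering what used to be separate globals. */
struct demo_globals {
    int keep_blanks;
    int validate;
};

/* One shared copy for the default, single-threaded mode ... */
static struct demo_globals demo_shared = { 1, 0 };
/* ... and one copy per thread for the threaded mode (C11 shorthand). */
static _Thread_local struct demo_globals demo_private = { 1, 0 };

static struct demo_globals *demo_shared_accessor(void)
{
    return &demo_shared;
}

static struct demo_globals *demo_private_accessor(void)
{
    return &demo_private;
}

/* The overridable accessor: a global pointer to the function in use. */
struct demo_globals *(*demo_get_globals)(void) = demo_shared_accessor;

/* The old global names become macros. Each expands to an lvalue, so
 * existing code that reads or assigns them compiles unchanged. */
#define demoKeepBlanks (demo_get_globals()->keep_blanks)
#define demoValidate   (demo_get_globals()->validate)

/* The single step that switches to threaded mode. This is a plain
 * pointer store, so the sketch assumes it happens before any other
 * thread is started. */
void demo_enable_threads(void)
{
    demo_get_globals = demo_private_accessor;
}

void demo_macro_usage(void)
{
    demoValidate = 1;                     /* old-style assignment */
    assert(demoValidate == 1);
    assert(demo_shared.validate == 1);    /* landed in the shared copy */

    demo_enable_threads();
    assert(demoValidate == 0);            /* private copy, own default */
    demoKeepBlanks = 0;
    assert(demo_private.keep_blanks == 0);
    assert(demo_shared.keep_blanks == 1); /* shared copy untouched */
}
```

 Code that takes the address of the old variable keeps compiling too, since the macro is still an lvalue, but it does need recompiling against the new header, which is the point about recompiled libraries above.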

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



