Re: [xml] libxml2 thread safety (SUMMARY)
- From: Daniel Veillard <veillard redhat com>
- To: Gary Pennington <Gary Pennington uk sun com>
- Cc: xml gnome org
- Subject: Re: [xml] libxml2 thread safety (SUMMARY)
- Date: Wed, 3 Oct 2001 07:47:51 -0400
On Wed, Oct 03, 2001 at 09:56:20AM +0100, Gary Pennington wrote:
Secondly, I believe that it is possible to make libxml2 thread-safe and
backwards compatible. Key to this belief is the use of TSD (as I
described in my code samples yesterday). All global state flags will be
moved into a TSD structure which means that each thread will have its
own view of the libxml2 configuration. The big plus of this approach is
that multiple threads get private behaviour out of libxml2 without
requiring any re-writing of user code. I'll repeat that, since its
importance is near the top of my list and no-one else seemed to be
bothered about this. NO re-writing of code for existing single-threaded
libxml2 consumers. The big negative of this approach is that you need to
be very careful about behaviour when passing documents around between
threads, since each thread will have its own view of the libxml2
configuration flags. I believe this is a small price to pay for
backwards compatibility, but it should be considered.
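For readers unfamiliar with TSD (thread-specific data), the idea in the paragraph above can be sketched roughly as follows with POSIX threads. This is only an illustration under stated assumptions: `demo_state`, its two flags, and every `demo_*` function are hypothetical names, not libxml2's actual structures or API. Only the `pthread_*` calls are real.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread settings; not libxml2's actual layout. */
typedef struct {
    int keep_blanks;   /* illustrative flag, defaults to 1 */
    int validate;      /* illustrative flag, defaults to 0 */
} demo_state;

static pthread_key_t demo_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;

/* Create the TSD key once; free() releases a thread's copy when it exits. */
static void demo_make_key(void) {
    if (pthread_key_create(&demo_key, free) != 0)
        abort();
}

/* Return the calling thread's private state, allocating it on first use. */
static demo_state *demo_get_state(void) {
    demo_state *s;

    pthread_once(&demo_key_once, demo_make_key);
    s = pthread_getspecific(demo_key);
    if (s == NULL) {
        s = calloc(1, sizeof(*s));
        if (s == NULL)
            abort();
        s->keep_blanks = 1;
        if (pthread_setspecific(demo_key, s) != 0)
            abort();
    }
    return s;
}

/* Worker: change a flag in this thread's copy and report what it sees. */
static void *demo_worker(void *out) {
    demo_get_state()->validate = 1;
    *(int *)out = demo_get_state()->validate;
    return NULL;
}

/* Returns the caller's validate flag after a worker changed its own copy. */
int demo_flag_after_worker(int *worker_saw) {
    pthread_t t;

    assert(worker_saw != NULL);
    demo_get_state()->validate = 0;
    if (pthread_create(&t, NULL, demo_worker, worker_saw) != 0)
        abort();
    if (pthread_join(t, NULL) != 0)
        abort();
    return demo_get_state()->validate;
}
```

In this sketch the worker sees its own change while the calling thread's flag is untouched. That is the "private behaviour" described above, and also the source of the caveat about passing documents between threads: the receiving thread consults its own flags, not those of the thread that built the document.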
I have a small concern around the "recompilation for threaded usage"
point. I think it is possible to make the changes in a backward compatible
way, and allow the same shared libraries to serve both kinds of users.
Thirdly, If we do decide that backwards compatibility is not an issue.
Well clearly it is an issue. I can't start forking code now, I'm
not ready to work on libxml3 yet, libxml2 is expected to stabilize
for the Gnome-2.0 release. And if (as I believe) we can provide a binary
compatible version with thread support then this should be pursued.
Then there are many measures that can be taken to improve the library.
New re-entrant APIs for setting state using user supplied storage,
remove all global state, etc... However, I don't believe this is a
realistic solution as it will cause major compatibility issues for the
existing user base. Let's keep discussing this, but I believe that
backwards compatibility is right at the top of the design goals for any
change to the library.
Fourthly, you are near the top of my list of people we should help
(well, to be more precise - not hinder). I'm not going to provide any
thread support to help you, but I certainly don't want to break any of
your existing code. When the library is built for your platform, the
thread specific code will be excluded from the compile and you will get
a single threaded libxml2 that you can use as you choose. Key to this
approach working is the adoption of TSD and the limited intrusiveness on
libxml2 which I advocate.
I still think we can come up with a single mode solution.
Finally, this is an interesting question since it reveals the difference
between making a library thread-safe and having a thread-safe convention
for using the library. I believe that this approach will result in a
thread-safe convention for using libxml2; whereas I would like to make
the library thread-safe. Taking this approach means that you must trust
your clients to behave properly; they mustn't try to manipulate global
state whilst a parse is in progress, for instance. I don't think that
this will result in a library that can be used effectively by
multi-threaded clients, in fact I have already considered this option at
some length and discarded it before making this proposal. One very good
reason for discarding this approach becomes evident when you consider
what would happen if you tried to link your application with another
library which also used libxml2; but directly - not following your
synchronisation conventions or using your synchronisation wrapper layer.
I think the answer to the question I posed is a clear no.
If the extra library hasn't been recompiled you can't guarantee any thread
safety anyway, right? If it had been recompiled and there is only one
mode of access (i.e. removing the threaded vs. non-threaded way of getting
the information) then I don't see where the problem would come from.
My personal inclination as I explained yesterday would be:
- to keep all global settings in a new structure
- that structure is accessed using an overridable function returning
  a pointer to that structure
- the old name for the global variable is turned into a macro
  dereferencing the pointer returned by that function (I think
  this will keep the read/write capability of that property).
Do you foresee any significant problem with this approach? Any recompiled
library would work in the same mode as the application. There is a single
atomic step to switch to a threaded mode which is to set the global variable
containing the pointer to the accessor function.
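The three points above might look something like the following in C. This is a minimal sketch, not the code that went into libxml2: `demo_globals`, `demo_get_globals`, the two accessors, and the macro names are all hypothetical, and C11 `_Thread_local` merely stands in for whatever per-thread storage a threaded build would actually use.

```c
#include <assert.h>

/* Hypothetical structure gathering the former global settings. */
typedef struct {
    int keep_blanks_default;
    int do_validity_checking;
} demo_globals;

/* Single-threaded default: one process-wide structure. */
static demo_globals demo_process_globals = { 1, 0 };

static demo_globals *demo_default_accessor(void) {
    return &demo_process_globals;
}

/* Threaded alternative: each thread gets its own copy. */
static demo_globals *demo_threaded_accessor(void) {
    static _Thread_local demo_globals per_thread = { 1, 0 };
    return &per_thread;
}

/* The global pointer to the accessor; overriding it switches modes. */
demo_globals *(*demo_get_globals)(void) = demo_default_accessor;

/* The old variable names become macros. Each expands to an lvalue,
   so existing code can still both read and assign them. */
#define demoKeepBlanksDefault  (demo_get_globals()->keep_blanks_default)
#define demoDoValidityChecking (demo_get_globals()->do_validity_checking)

/* Legacy-style caller: source unchanged, only recompiled. */
int demo_legacy_caller(void) {
    demoDoValidityChecking = 1;           /* write through the macro */
    assert(demoDoValidityChecking == 1);  /* read it back */
    return demoKeepBlanksDefault;
}

/* The one step that switches to threaded mode. */
void demo_enable_threaded_mode(void) {
    demo_get_globals = demo_threaded_accessor;
}
```

Because the macro expands to a member access through the returned pointer, statements such as `demoDoValidityChecking = 1;` keep compiling after the variable is gone. One caveat with this sketch: a plain pointer assignment carries no synchronisation guarantee in C, so the switch would have to happen before any other thread is started.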
Daniel Veillard | Red Hat Network http://redhat.com/products/network/
veillard redhat com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/