Re: [xml] libxml2 thread safety (SUMMARY)
- From: Gary Pennington <Gary Pennington uk sun com>
- To: veillard redhat com
- Cc: xml gnome org
- Subject: Re: [xml] libxml2 thread safety (SUMMARY)
- Date: Thu, 04 Oct 2001 11:43:26 +0100
Daniel Veillard wrote:
On Thu, Oct 04, 2001 at 10:13:38AM +0100, Gary Pennington wrote:
Here's the appropriate snippet (relating to a single variable
xmlDoValidityCheckingDefaultValue for puposes of illustration, there
would be one of these for each global that was modified) from the header
file to show how it works:
#if defined(_REENTRANT) || (_POSIX_C_SOURCE - 0 >= 199506L)
What I dislike in this solution is that we end-up with 2 different libraries
one when one compiles with _REENTRANT and one without this. From a packaging
and maintainance point of view this is no fun at all.
Agreed, but that's the cost of backwards compatibility. I'll try to keep
the intrusion down to a minimum and re-locate the global variables into
a separate file where they can be easily identified.
extern int *__xmlDoValidityCheckingDefaultValue();
#define xmlDoValidityCheckingDefaultValue \
(*(__xmlDoValidityCheckingDefaultValue()))
#else
extern int xmlDoValidityCheckingDefaultValue;
#endif
Any new code compiled would either be threaded or non-threaded. It would
pick up the macro definition if threaded, the standard extern symbol if
not. Existing code would still use the standard extern symbol and would
thus behave as non-threaded code.
The implementation of __xmlDoValidityCheckingDefaultValue() shows how
the value of the TSD is manipulated
#undef xmlDoValidityCheckingDefaultValue
int xmlDoValidityCheckingDefaultValue = 0;
int *
__xmlDoValidityCheckingDefaultValue()
{
struct glob_struct *globalval;
if (keyonce == 0) {
(void) pthread_mutex_lock(&keylock);
if (keyonce == 0) {
keyonce++;
(void) pthread_key_create(&globalkey, tsd_free);
}
(void) pthread_mutex_unlock(&keylock);
}
if ((globalval = (struct glob_struct
*)pthread_getspecific(globalkey))
== NULL) {
struct glob_struct *tsd = alloc_glob_struct();
pthread_setspecific(globalkey, tsd);
return (&tsd->xmlDoValidityCheckingDefaultValue);
} else
return (&globalval->xmlDoValidityCheckingDefaultValue);
}
Okay, I would not call this beautiful but it is acceptable if contained
to a limited area. There is also 2 places where I would need to take a lock
in other places in the code. I suggest the following:
- sepearate the global state definition in a state.h header
- move all this code and definitions of the global variables in
a state.c module
- export the locking as a libxml function and data to help portability,
this is not a full encapsulation of a thread system, just the
create/lock/unlock/destroy of a lock since it will be used in other
places in libxml (and possibly libxslt too).
I would propose adding two functions to support locking/unlocking of the
parser context. I don't think that I will export the lock create/destroy
since that is internal to the library and should have nothing to do with
any consumers. I'll think more about the utility of internal
functions/macros to do this.
I really think that the above is only a temporary measure until a
"proper" re-working of libxml (i.e. libxml3) can be performed. It's a
solution that promotes backwards-compatibility, however it's not as
clean as doing the reworking. Some of the work required to remove global
state would be performed which would help in the migration to libxml3
and that would mean phased implementation of the threading support,
which is either a good or a bad thing depending on your point of view.
This is fine by me at least.
We could decide that this is not worth doing and that we should just
wait for libxml3 and do a non-backwards compatible reworking. I have
some (limited) time to work on making this happen, but I'd be happy to
defer and wait to do a proper reworking for libxml2 if I knew what the
time frame for that project was.
Go ahead. I will integrate it, I probably won't make it the default
(i.e. it will require a configure flag) but once the capacity will be there
we will get more feedback.
Ok.
Anyway, I have a series of test programs using the above partial
implementation which work fine single threaded - I still need to write a
more comprehensive mt-driver (I've run through the testall target) and
the main (technically) tricky bits remaining are :
1. Deciding exactly which global symbols go into the global symbol
structure. All global variables or only those we expect to be modified?
I would go for the latter. For example the predefined entities list
is global and should be shared by all threads, I would not put it in
the structure. Other variables are harder, like the Catalog ones, it
could make sense to maintain them per thread but they are costly to build,
this is not simple to decide.
I will need some help here. I have a candidate list (which is all the
exported clearly non-static variables in 2.4.5) could you go through
this list and indicate to me which globals should be thread specific.
Here's the list (ignore the undef):
#undef docbDefaultSAXHandler
#undef htmlDefaultSAXHandler
#undef oldXMLWDcompatibility
#undef xmlBufferAllocScheme
#undef xmlDefaultBufferSize
#undef xmlDefaultSAXHandler
#undef xmlDefaultSAXLocator
#undef xmlDoValidityCheckingDefaultValue
#undef xmlFree
#undef xmlGenericError
#undef xmlGenericErrorContext
#undef xmlGetWarningsDefaultValue
#undef xmlIndentTreeOutput
#undef xmlKeepBlanksDefaultValue
#undef xmlLineNumbersDefaultValue
#undef xmlLoadExtDtdDefaultValue
#undef xmlMalloc
#undef xmlMemStrdup
#undef xmlParserDebugEntities
#undef xmlParserVersion
#undef xmlPedanticParserDefaultValue
#undef xmlRealloc
#undef xmlSaveNoEmptyTags
#undef xmlStringComment
#undef xmlStringText
#undef xmlStringTextNoenc
#undef xmlSubstituteEntitiesDefaultValue
I know that some of them are clearly not meant to be re-defined on a
per-thread basis, e.g. xmlMalloc but I've left them in for completeness
sake. Could you indicate which variables should go in the structure in
your opinion? I will assume that any variables which aren't intended to
be changed by consumers are the ones left over and they will be clearly
documented as such when this work is done.
A consequence of this work is that no more global state should be
introduced into libxml2, at least not unless it's completely
unavoidable. We want to minimise the amount of global state to make
movement to libxml3 as painless as possible.
2. Working out the appropriate configure incantation to make thread
support optional even if detected on the platform. (If we decide we want
this to be an option)
definitely.
Non-technically, writing the docs will be extremely important too.
Daniel
Gary
--
Gary Pennington
Solaris Kernel Development,
Sun Microsystems
Gary Pennington sun com
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]