Re: [xml] First experiments with threading



Daniel Veillard wrote:

 I integrated Gary's patches, and started doing some tests.
I built the library with and without threading on Linux and used
xmllint, i.e. an application with a single thread to test side
effects and evaluate the impact on performance:

  the test is to parse (and possibly validate) the XML spec 100 times
    xmllint --noout --repeat test/XML/test/valid/REC-xml-19980210.xml
  and
    xmllint --valid --noout --repeat test/XML/test/valid/REC-xml-19980210.xml

           Non threaded            Threaded

normal      3720ms                  5857ms
valid       6712ms                  9781ms

 I made the tests on an SMP box running Linux Red Hat with 2.4.7 kernel.
I think the impact comes from the access to the memory routines, libxml
is extremely aggressive on the memory allocator, and in the current code
the xmlMalloc/xmlRealloc/xmlFree are part of the per-thread data. This
means that we are currently calling pthread_self() in addition to the
existing routine each time we access them, and this shows in the performance.
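
To illustrate the cost described above, here is a minimal sketch and not the actual libxml code. It contrasts an application-wide allocator table with one that is looked up per thread on every call. All demo_* names are hypothetical, and the lookup uses pthread_getspecific() as an assumption; the patch under discussion may obtain its per-thread data differently (the text above mentions pthread_self()).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical allocator table; not libxml's real data structure. */
typedef struct {
    void *(*malloc_fn)(size_t);
    void  (*free_fn)(void *);
} demo_alloc;

/* Application-wide variant: one global table, read directly. */
static demo_alloc demo_global = { malloc, free };

static void *demo_malloc_global(size_t n) {
    return demo_global.malloc_fn(n);
}

/* Per-thread variant: each call first finds the calling thread's table. */
static pthread_key_t  demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

static void demo_make_key(void) {
    /* free() releases a thread's table when that thread exits. */
    int rc = pthread_key_create(&demo_key, free);
    assert(rc == 0);
    (void)rc;
}

static demo_alloc *demo_thread_state(void) {
    pthread_once(&demo_once, demo_make_key);
    demo_alloc *st = pthread_getspecific(demo_key);
    if (st == NULL) {
        /* First use in this thread: start from the global defaults. */
        st = malloc(sizeof *st);
        if (st == NULL)
            return NULL;
        *st = demo_global;
        pthread_setspecific(demo_key, st);
    }
    return st;
}

static void *demo_malloc_per_thread(size_t n) {
    /* The extra lookup below is paid on every single allocation. */
    demo_alloc *st = demo_thread_state();
    return st ? st->malloc_fn(n) : NULL;
}
```

In a library that allocates as heavily as libxml, the extra call in demo_malloc_per_thread() is made once per allocation, which is where a per-thread design would be expected to lose time.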

On Solaris 8, I'm seeing a different picture.

I've run the version built with the patches I submitted against 2.4.5 and I see a 12%/12% degradation (normal/valid). You are showing 57%/46%, which is a lot higher and also not consistent between the two runs.
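
For reference, the Linux percentages follow directly from the timings in the table above. A small sketch of the arithmetic (the function name overhead_pct is made up for this example):

```c
#include <assert.h>

/* Percentage slowdown of a threaded run relative to the non-threaded baseline. */
static double overhead_pct(double baseline_ms, double threaded_ms) {
    assert(baseline_ms > 0.0);
    return (threaded_ms - baseline_ms) / baseline_ms * 100.0;
}
```

overhead_pct(3720, 5857) gives roughly 57.4 and overhead_pct(6712, 9781) gives roughly 45.7, which round to the 57% and 46% quoted here.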

I guess that this could be down to the difference between the Solaris/Linux threading models or it could be partly as a result of changes in the code above and beyond the patches I submitted.

I think that to make sure we are comparing like with like, I should build against your modifications. Is there a CVS incantation that I can use to pull down the same code that you tested with or can I just get the latest code from the cvs repository?




 My current views are:
  1/ that threaded mode should not be the default configuration

I agree. Lots of people don't need thread support and would rather not pay the performance price (which will always exist no matter how much we try to minimize it).


  2/ that by default xmlMalloc/xmlRealloc/xmlFree should be kept
     application wide settings

This might be wise anyway, since a thread passing a memory pointer to another thread may have problems if the memory allocator used by each thread is significantly different.


  3/ that it shall be relatively straightforward to make them
     thread specific with the use of a dedicated #define
     this can still be useful for other thread models where the
     equivalent of pthread_self() is really cheap.
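
One way points 2/ and 3/ could fit together is sketched below. This is an assumption-laden illustration, not the code from the patch: the macro DEMO_THREAD_SPECIFIC_ALLOC and all demo_* names are invented, and C11 _Thread_local stands in for a thread model with a cheap per-thread lookup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator table; not libxml's real data structure. */
typedef struct {
    void *(*malloc_fn)(size_t);
    void  (*free_fn)(void *);
} demo_alloc;

/* DEMO_THREAD_SPECIFIC_ALLOC is a made-up macro name for this sketch. */
#ifdef DEMO_THREAD_SPECIFIC_ALLOC
/* Opt-in: each thread gets its own copy of the table. */
static _Thread_local demo_alloc demo_state = { malloc, free };
#else
/* Default: a single table shared by the whole application. */
static demo_alloc demo_state = { malloc, free };
#endif

/* Replaces the allocator for the application, or for the calling
 * thread only when the thread-specific build is selected. */
static void demo_set_allocator(void *(*m)(size_t), void (*f)(void *)) {
    assert(m != NULL && f != NULL);
    demo_state.malloc_fn = m;
    demo_state.free_fn = f;
}

static void *demo_malloc(size_t n) { return demo_state.malloc_fn(n); }
static void  demo_free(void *p)    { demo_state.free_fn(p); }
```

Built without the macro, callers pay only for an indirect call; building with -DDEMO_THREAD_SPECIFIC_ALLOC switches the same call sites to per-thread settings without touching them.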

The code is already written. Could you make it a configurable option for the brave of heart?



However I remember getting mail asking for a "per thread" allocator
so that one could build a zero-cost deallocator, and clearly
this is not a good idea. I turned down the request at that point
and now I have data to back up my position :-)

See above. It might not be such a bad thing for some people.



 I will work on 2/ to check if my analysis of the added cost was right,
and will do 3/ if it is confirmed. Then I will clean up the couple of places
where libxml code needs locking and add some threading regression tests.


Gary

--
Gary Pennington
Solaris Kernel Development,
Sun Microsystems
Gary Pennington sun com





