Performance Patches (was Re: [xml] too many mallocs?)



Hi Daniel, All,

As I have to understand the inner workings of libxml2 anyway, I may as
well look for some performance patches (given that it will soon
run on some 100MHz AMD Elan PC104 boards).

As this is scheduled to be a weekend hobby mostly, progress may be 
slow.

Two logistical difficulties: 
1) I'm still not able to CVS through our firewall and I've found no way
to get plain source files from Bonsai, so I would like to use the
snapshot tarballs. But the tarball linked from www.xmlsoft.org is dated
2002-03-07 - what's going on there?

2) You have packed zillions of test files under the test directory, but I
didn't find anything like an automatic harness to execute the
tests. Does something like this exist? It would be most useful
for rapidly discovering broken optimizations.

Some preliminary comments (and questions):

a) The "too many mallocs" problem
It seems to me that the different layers don't conspire enough
to save resources (time and memory). The problem is made worse
because the function signatures of some layers (SAX) are fixed. In effect,
every layer tends to allocate a new copy of the data.
The largest gain so far in my tests came from adding versions of
xmlNewDocNode, xmlNewNode and xmlNewNsProp
which take ownership of the 'name' string instead of strduping it.
Another idea: since libxml2 essentially acts as if strduping
is cheap, can we make it really cheap by going to reference-counted
strings? I'm not clear on this, and anyway, it would have
to wait for libxml3.
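
To make the ownership idea in a) concrete, here is a minimal sketch in plain C. It does not use libxml2 at all: the type toy_node and the functions toy_node_new, toy_node_new_eat_name and toy_node_free are hypothetical names invented for illustration, and the actual patch may well look different. It only contrasts a constructor that copies the name with one that adopts a heap-allocated name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy node type for illustration only; this is NOT libxml2's xmlNode. */
typedef struct toy_node {
    char *name;              /* always owned by the node */
} toy_node;

/* Copying constructor: the caller keeps ownership of `name`.
 * Costs one extra allocation plus a copy per node. */
toy_node *toy_node_new(const char *name)
{
    assert(name != NULL);
    toy_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    size_t len = strlen(name) + 1;
    n->name = malloc(len);
    if (n->name == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->name, name, len);
    return n;
}

/* Ownership-taking constructor: `name` must come from malloc() and
 * belongs to the node on success. On failure (NULL return) the
 * caller still owns `name`. No copy is made. */
toy_node *toy_node_new_eat_name(char *name)
{
    assert(name != NULL);
    toy_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->name = name;
    return n;
}

/* Frees the node and the name it owns. */
void toy_node_free(toy_node *n)
{
    if (n == NULL)
        return;
    free(n->name);
    free(n);
}
```

The saving only materialises when the caller already holds a freshly allocated copy of the name that it would otherwise free right after the call, which is the situation described above where each layer makes its own copy.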

b) Macros and the ctxt->token case
I'm still not positive about the macros (but I may have
erred regarding the multiple returns). They hide their
cost from the programmer but not from the processor.
RAW and CUR are used about 350 times in parser.c
and parserInternals.c. Eliminating the ctxt->token
test (and so replacing both of them with a simple
*ctxt->input->cur) was the greatest single factor in
decreasing the binary's size and the second largest
in increasing speed.
So that raises the question of whether ctxt->token can
be buried as a relic of the past. As I see it, all
nontrivial uses of it are already commented out (for
example in xmlParserHandleReference). The only
remaining assignments of values other than 0 are
the assignments of ' ' in xmlParserHandlePEReference
and xmlParsePEReference, and even there its use
is suspect, if I read the TODO comment right.
Can't we stuff the ' ' directly into the buffer if it is
really needed?
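
The cost difference in b) can be sketched with a toy context. Everything here (toy_input, toy_ctxt, the TOY_CUR_* macros and the two skip functions) is a hypothetical simplification, not the real parser.c definitions; it only shows that the token-aware accessor adds a test at every use, while the direct form is a plain dereference.

```c
#include <assert.h>

/* Toy stand-ins for the parser context; NOT the libxml2 structures. */
typedef struct toy_input {
    const unsigned char *cur;    /* current read position */
} toy_input;

typedef struct toy_ctxt {
    toy_input *input;
    int token;                   /* pushed-back character, 0 if none */
} toy_ctxt;

/* Token-aware accessor: one extra test and branch at every use. */
#define TOY_CUR_WITH_TOKEN(ctxt) \
    ((ctxt)->token ? (ctxt)->token : (int) *(ctxt)->input->cur)

/* Direct accessor: a plain dereference. */
#define TOY_CUR_DIRECT(ctxt) ((int) *(ctxt)->input->cur)

/* Skips blanks with the direct accessor; returns how many were skipped. */
int toy_skip_blanks_direct(toy_ctxt *ctxt)
{
    int n = 0;
    while (TOY_CUR_DIRECT(ctxt) == ' ') {
        ctxt->input->cur++;
        n++;
    }
    return n;
}

/* Same loop with the token-aware accessor: the pushed-back character
 * has to be consumed separately from the buffer. */
int toy_skip_blanks_with_token(toy_ctxt *ctxt)
{
    int n = 0;
    while (TOY_CUR_WITH_TOKEN(ctxt) == ' ') {
        if (ctxt->token)
            ctxt->token = 0;     /* consume the pushed-back blank */
        else
            ctxt->input->cur++;
        n++;
    }
    return n;
}
```

With a macro expanded at several hundred call sites, the extra test is duplicated at each of them, which is consistent with the effect on binary size reported above.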

c) UTF-8 conversion
I'm wondering whether parser.c can be changed to always work
on xmlChars, so that all costly conversions to 32-bit Unicode code points
can be avoided. My first impression is that only the NameChar and
NameStartChar checks really care about Unicode code points,
and these checks could be replaced by multi-level table lookups
on the UTF-8 bytes.
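
A rough sketch of what such a table lookup in c) could look like follows. It is deliberately tiny: it covers only ASCII and the two-byte sequences with lead byte 0xC3 (U+00C0 to U+00FF), and reports everything else as "not covered" so that a caller could fall back to decoding. All names (toy_tables_init, toy_is_name_start, the TOY_* classes) are hypothetical, and a real implementation would need second-level tables for every lead byte.

```c
#include <assert.h>
#include <stddef.h>

/* Classes stored in the first-level table (hypothetical names). */
enum { TOY_NO = 0, TOY_YES = 1, TOY_LEAD_C3 = 2, TOY_UNCOVERED = 3 };

static unsigned char toy_first[256]; /* indexed by the first byte */
static unsigned char toy_c3[64];     /* indexed by (second byte & 0x3F) */

/* Fills both tables; must be called once before any lookup. */
void toy_tables_init(void)
{
    int i;
    for (i = 0; i < 256; i++)
        toy_first[i] = TOY_NO;       /* includes stray continuation bytes */
    for (i = 'A'; i <= 'Z'; i++)
        toy_first[i] = TOY_YES;
    for (i = 'a'; i <= 'z'; i++)
        toy_first[i] = TOY_YES;
    toy_first['_'] = TOY_YES;
    toy_first[':'] = TOY_YES;
    toy_first[0xC3] = TOY_LEAD_C3;   /* U+00C0..U+00FF */
    for (i = 0xC4; i <= 0xF4; i++)
        toy_first[i] = TOY_UNCOVERED; /* would need more tables */

    /* U+00C0..U+00FF are letters except U+00D7 and U+00F7. */
    for (i = 0; i < 64; i++)
        toy_c3[i] = 1;
    toy_c3[0x97 & 0x3F] = 0;         /* U+00D7 MULTIPLICATION SIGN */
    toy_c3[0xB7 & 0x3F] = 0;         /* U+00F7 DIVISION SIGN */
}

/* Returns 1 if the UTF-8 sequence at p starts a name, 0 if it does
 * not, and -1 if it lies outside the ranges this sketch covers. */
int toy_is_name_start(const unsigned char *p, size_t len)
{
    if (len == 0)
        return 0;
    switch (toy_first[p[0]]) {
    case TOY_YES:
        return 1;
    case TOY_LEAD_C3:
        if (len < 2 || (p[1] & 0xC0) != 0x80)
            return 0;                /* truncated or malformed */
        return toy_c3[p[1] & 0x3F];
    case TOY_UNCOVERED:
        return -1;
    default:
        return 0;
    }
}
```

The ASCII case costs a single table load and no decoding at all, which is where most of the gain would be expected for typical documents.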

d) ctxt->sax checking
Another candidate for elimination is the roughly 250 places where
ctxt->sax is checked before calling a SAX callback. Looking at
xmlSAXParseFile and xmlSAXParseMemory, I'm under the impression that
ctxt->sax is never NULL. Also, by changing the line
 ctxt->sax = sax
to
 *ctxt->sax = *sax
we are free to replace the NULL entries in the SAX callback struct with NOP
handlers, and the tests for the callbacks being NULL can be eliminated
too. Finally, if SAX is disabled by overwriting all
entries with NOP handlers, even the tests of ctxt->disableSAX can be
eliminated.
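
The NOP-handler idea in d) can be sketched as follows, again with made-up types rather than the real xmlSAXHandler (toy_sax, toy_ctxt and all toy_* functions are hypothetical). The toy context embeds its own copy of the handler table instead of holding a pointer to one, which plays the role of the *ctxt->sax = *sax assignment above.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_element_fn)(void *user, const char *name);

/* Toy callback table; NOT libxml2's xmlSAXHandler. */
typedef struct toy_sax {
    toy_element_fn start_element;
    toy_element_fn end_element;
} toy_sax;

typedef struct toy_ctxt {
    toy_sax sax;     /* private copy, so entries can be patched safely */
    void *user;
} toy_ctxt;

/* Handler that does nothing; stands in for a NULL entry. */
static void toy_nop_element(void *user, const char *name)
{
    (void) user;
    (void) name;
}

/* Example user handler: counts calls through an int pointer. */
void toy_count_element(void *user, const char *name)
{
    (void) name;
    ++*(int *) user;
}

/* Copies the caller's table and replaces NULL entries with NOPs. */
void toy_set_sax(toy_ctxt *ctxt, const toy_sax *sax)
{
    assert(ctxt != NULL);
    if (sax != NULL) {
        ctxt->sax = *sax;
    } else {
        ctxt->sax.start_element = NULL;
        ctxt->sax.end_element = NULL;
    }
    if (ctxt->sax.start_element == NULL)
        ctxt->sax.start_element = toy_nop_element;
    if (ctxt->sax.end_element == NULL)
        ctxt->sax.end_element = toy_nop_element;
}

/* Disabling SAX overwrites every entry with the NOP handler. */
void toy_disable_sax(toy_ctxt *ctxt)
{
    ctxt->sax.start_element = toy_nop_element;
    ctxt->sax.end_element = toy_nop_element;
}

/* Call site: no NULL test and no "disabled" flag test. */
void toy_emit_element(toy_ctxt *ctxt, const char *name)
{
    ctxt->sax.start_element(ctxt->user, name);
    ctxt->sax.end_element(ctxt->user, name);
}
```

The trade-off is an indirect call to an empty function where a test would previously have skipped the call. Whether that is a net win would have to be measured.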

Regards,
Peter Jacobi







