Re: Performance Patches (was Re: [xml] too many mallocs?)

From: Daniel Veillard <veillard redhat com>
To: Peter Jacobi <pj walter-graphtek com>
Cc: xml gnome org
Subject: Re: Performance Patches (was Re: [xml] too many mallocs?)
Date: Mon, 27 May 2002 07:34:07 -0400

On Mon, May 27, 2002 at 10:52:50AM +0200, Peter Jacobi wrote:

Hi Daniel, All,

As I have to understand the inner workings of libxml2 anyway, I can as 
well try to look for some performance patches (given the fact that
it will run on some 100MHz AMD Elan PC104 boards soon).

As this is scheduled to be a weekend hobby mostly, progress may be 
slow.


  No worries, Garry Pennington is also doing some performance work
in the background, I'm actually quite happy to see people learning
the internals, the larger the pool of knowledgeable people about the
code the safer the project !

Two logistical difficulties: 
1) I'm still not able to CVS through our firewall and I've found no way
to get plain source files from Bonsai, so I would like to use the
snapshot tarballs. But the tarball linked from www.xmlsoft.org is dated
2002-03-07 - what's going on there?


  For some reason the cron job isn't working, so I relaunched it manually:
  -rw-r--r--    1 veillard www       4523125 May 27 06:34 cvs-snapshot.tar.gz

2) You have packed zillions of test files under the test directory, but I
didn't find anything like an automatic harness to execute the
tests. Does something like this exist? It would be most useful
for rapidly discovering broken optimizations.


  Run "make tests" with the Unix makefiles. I also use the
check-xml-test-suite.py Python-based script to check against the W3C/NIST
XML test suite; the only pointer I have at the moment is the public mail
archive:
   http://lists.w3.org/Archives/Public/public-xml-testsuite/

Some preliminary comments (and questions):

a) The "too many malloc problem"
It seems to me that the different layers don't conspire enough
to save resources (time and memory). This problem is made worse
because the function signatures of some layers (SAX) are fixed. In effect,
every layer tends to allocate a new copy of the data.


  Right, SAX is a big pain, especially since I want to keep references
to entities in attributes and SAX was not designed to allow this.

The largest gain so far in my tests came from adding versions of 
xmlNewDocNode, xmlNewNode and xmlNewNsProp
which take ownership of the 'name' string instead of strduping it.


  Hum :-( that's a serious API change, but as separate functions this
sounds fine.
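
The difference between a copying constructor and one that takes ownership of the name can be sketched roughly as follows. This is a toy model only: `toy_node`, `toy_new_node` and `toy_new_node_take_name` are hypothetical names invented for illustration and are not the libxml2 functions discussed above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a tree node; hypothetical, not libxml2's xmlNode. */
typedef struct toy_node {
    char *name;   /* always owned by the node and freed with it */
} toy_node;

/* Copying constructor: the caller keeps ownership of `name`.
 * Costs one extra allocation and one copy per node. */
toy_node *toy_new_node(const char *name) {
    assert(name != NULL);
    toy_node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    size_t len = strlen(name) + 1;
    n->name = malloc(len);
    if (n->name == NULL) { free(n); return NULL; }
    memcpy(n->name, name, len);
    return n;
}

/* Owning constructor: `name` must come from malloc(). On success the
 * node takes it over and the caller must not free it; on failure
 * (NULL return) the caller still owns it. */
toy_node *toy_new_node_take_name(char *name) {
    assert(name != NULL);
    toy_node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->name = name;   /* no copy, just a pointer hand-over */
    return n;
}

/* Frees the node together with its name in both cases. */
void toy_free_node(toy_node *n) {
    if (n == NULL) return;
    free(n->name);
    free(n);
}
```

Keeping the owning variant as a separate function, as suggested in the reply, leaves existing callers of the copying version untouched.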

Another idea is, as libxml2 is essentially acting as if strduping
is cheap, can we make it really cheap by going to reference
counted strings? I'm not clear on this, and anyway, it would have
to wait for libxml3.


  I did make an attempt at this and concluded at the time that there
was no significant gain to be made. But in the meantime I removed a number
of CPU-intensive operations, so the picture may have changed significantly.
I also only ran those tests on Linux.
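
For readers unfamiliar with the idea, a reference-counted string might look roughly like the sketch below. The type `rc_str` and its functions are hypothetical and do not exist in libxml2; the sketch is not thread-safe and says nothing about whether the approach would pay off in practice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; not a libxml2 type. */
typedef struct rc_str {
    size_t refs;    /* number of live owners */
    size_t len;     /* length without the terminating NUL */
    char   data[];  /* characters stored inline after the header */
} rc_str;

/* Allocates header and characters in one block, with one owner. */
rc_str *rc_str_new(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s);
    rc_str *r = malloc(sizeof *r + len + 1);
    if (r == NULL) return NULL;
    r->refs = 1;
    r->len = len;
    memcpy(r->data, s, len + 1);
    return r;
}

/* "Duplicates" by sharing: a counter increment instead of malloc+copy. */
rc_str *rc_str_ref(rc_str *r) {
    assert(r != NULL);
    r->refs++;
    return r;
}

/* Drops one reference; returns 1 if the storage was freed, else 0. */
int rc_str_unref(rc_str *r) {
    assert(r != NULL && r->refs > 0);
    if (--r->refs == 0) {
        free(r);
        return 1;
    }
    return 0;
}
```

The trade-off is that every holder must pair each `rc_str_ref` with an `rc_str_unref`, and shared strings can no longer be modified in place.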

b) Macros and the ctxt->token case
I'm still not positive about the macros (but I may have
erred regarding the multiple returns). They hide their
cost from the programmer but not from the processor.
RAW and CUR are used about 350 times in parser.c
and parserinternals.c - eliminating the ctxt->token
test (and so replacing both of them with a simple
*ctxt->input->cur) was the greatest single factor for
decreasing the binary's size and the second largest
for increasing speed.


  yes but it's likely to seriously break the parser. Any such change
which wasn't tested with "make tests" holds little trust, conformance
is still my #1 goal.
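
The kind of change being debated can be illustrated with a toy model. The structs and macros below are hypothetical simplifications written for this note; they are not the definitions in parser.c and only mirror the shape of a "pending token" test in front of a buffer read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the parser structures. */
typedef struct toy_input {
    const unsigned char *cur;   /* next unread byte in the buffer */
} toy_input;

typedef struct toy_ctxt {
    int token;                  /* pending pushed-back character, 0 if none */
    toy_input *input;
} toy_ctxt;

/* Variant with the extra test: a pending token shadows the buffer. */
#define TOY_CUR_CHECKED(ctxt) \
    ((ctxt)->token ? (ctxt)->token : (int) *(ctxt)->input->cur)

/* Variant without the test: a plain dereference. */
#define TOY_CUR_DIRECT(ctxt) ((int) *(ctxt)->input->cur)

/* Returns 1 when both variants yield the same character. They can only
 * disagree while a token is pending, which is why dropping the test is
 * safe only if nothing ever sets token to a non-zero value. */
int toy_variants_agree(const toy_ctxt *ctxt) {
    assert(ctxt != NULL && ctxt->input != NULL && ctxt->input->cur != NULL);
    return TOY_CUR_CHECKED(ctxt) == TOY_CUR_DIRECT(ctxt);
}
```

Every expansion of the checked macro carries a branch, so a macro used several hundred times adds that many tests to the binary. The conformance risk is the disagreement case above.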

So that raises the question of whether ctxt->token can
be buried as a relic from the past. As I see it, all


 I don't think so! But you're right, it might be a good idea to
double-check.

nontrivial uses of it are already commented out (for
example in xmlParserHandleReference). The only
remaining assignment of other values than 0, is
the assignment of ' ' in xmlParserHandlePEReference
and xmlParsePEReference, and even there its use
is suspect if I read the TODO comment right.
Can't we stuff the ' ' directly into the buffer if it is
really needed?


 Hum, maybe, yes

c) UTF8 conversion
I'm wondering whether parser.c can be changed to always work
on xmlChars, so that all costly conversions to 32-bit Unicode code points
can be avoided.


  no, you need to check the values of the codepoints for conformance.

My first impression is that only the NameChar and
NameStartChar checks really care about Unicode code points,
and these checks could be replaced by multi-level table lookups 
on the UTF-8 bytes.


  no, the restrictions of the Char production apply to the full content
of the XML document (except where stricter rules apply).
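
To make the conformance requirement concrete, here is a minimal sketch of decoding one UTF-8 sequence and testing the result against the Char production of XML 1.0 (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]). The function names are hypothetical and this is not how libxml2 implements the check.

```c
#include <assert.h>
#include <stddef.h>

/* XML 1.0 Char production, applied to a decoded code point. */
int toy_is_xml_char(unsigned long c) {
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

/* Decodes one UTF-8 sequence from s (at most n bytes available).
 * Stores the code point in *out and returns the number of bytes
 * consumed, or 0 on malformed, truncated, or overlong input. */
size_t toy_utf8_decode(const unsigned char *s, size_t n, unsigned long *out) {
    /* Smallest code point that legitimately needs each sequence length. */
    static const unsigned long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    unsigned long c;
    size_t len, i;

    assert(s != NULL && out != NULL);
    if (n == 0) return 0;

    if (s[0] < 0x80)                { c = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { c = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { c = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { c = s[0] & 0x07; len = 4; }
    else return 0;                  /* stray continuation or invalid lead */

    if (len > n) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len]) return 0;  /* overlong encoding */

    *out = c;
    return len;
}
```

Note that U+FFFE and the surrogate range decode as well-formed three-byte sequences here but are then rejected by `toy_is_xml_char`, which is the sort of case that a byte-level shortcut would have to reproduce exactly.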

d) ctxt->sax checking
Another candidate for elimination is the roughly 250 places that check 
ctxt->sax before calling the SAX callback. I'm under the impression that
ctxt->sax is never NULL, judging from xmlSAXParseFile and
xmlSAXParseMemory. Also when changing the line
 ctxt->sax = sax
to
 *ctxt->sax = *sax
we are free to change the zeros in the SAX callback struct to NOP 
handlers, and the tests for the callbacks being zero can be eliminated
too. Finally, when disabling the sax callback by overwriting all
entries with NOP handlers, even the tests of ctxt->disableSAX can be
eliminated.


  Not sure you would gain much by removing those tests, and I would not feel
comfortable dereferencing the pointer without checking it first. It's just
too easy to get a new widely deployed security hole ... checks are good IMHO
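
The two styles under discussion can be compared in a small sketch. Everything here is hypothetical: `toy_sax` is a one-slot stand-in for a callback table, not the real SAX handler struct, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-slot callback table. */
typedef void (*toy_start_fn)(void *user, const char *name);

typedef struct toy_sax {
    toy_start_fn start_element;   /* NULL means "not interested" */
} toy_sax;

/* Defensive style: test the table and the slot at every call site. */
void toy_emit_checked(const toy_sax *sax, void *user, const char *name) {
    if (sax != NULL && sax->start_element != NULL)
        sax->start_element(user, name);
}

/* Do-nothing handler used to fill empty slots. */
void toy_nop_start(void *user, const char *name) {
    (void) user;
    (void) name;
}

/* Example handler for demonstration: counts calls through `user`. */
void toy_count_start(void *user, const char *name) {
    (void) name;
    ++*(int *) user;
}

/* Copies the caller's table by value into a parser-owned table and
 * replaces every empty slot with the NOP handler. A NULL `src` yields
 * a table of NOPs. */
void toy_sax_install(toy_sax *dst, const toy_sax *src) {
    assert(dst != NULL);
    if (src != NULL)
        *dst = *src;
    else
        dst->start_element = NULL;
    if (dst->start_element == NULL)
        dst->start_element = toy_nop_start;
}

/* Unchecked style: valid only for tables prepared by toy_sax_install(). */
void toy_emit_unchecked(const toy_sax *sax, void *user, const char *name) {
    sax->start_element(user, name);
}
```

The unchecked call is only as safe as the guarantee that every table goes through `toy_sax_install` first. Any code path that stores a raw pointer instead turns a skipped callback into a NULL dereference, which is the risk raised in the reply.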

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
