Re: [xml] XPath optimizations
- From: Gary Pennington <Gary Pennington sun com>
- To: Petr Pajas <pajas ufal ms mff cuni cz>
- Cc: libxml2 <xml gnome org>
- Subject: Re: [xml] XPath optimizations
- Date: Thu, 09 May 2002 17:49:41 +0100
Petr Pajas wrote:
Hi All,
I found that libxml2 implementation is quite inefficient when computing
XPath on shallow but wide XML trees. I was trying to examine the case
of one-level document of the form <root><e/><e/>...</e></root> with
several thousand copies of element <e/>. The tested XPath expression
was count(/root/e).
A summary of my benchmarks follows. Later, I give some ideas on how to
improve this:
4000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
8000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
16000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 7 wallclock secs
32000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took:52 wallclock secs
<e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took:292 wallclock secs
0.00/s (n=1)
As you may see, the time needed to parse the document is linear,
however time needed to evaluate the expression is worse than O(n^2).
Examining the source code with a profiler showed that the problem is
caused by xmlXPathNodeSetSort and xmlXPathCmpNodes. The complexity is
in this case actually something like O(n^2.2).
The sort method used in xmlXPathNodeSetSort is Shell's sort which
itself is st. between O(n^1.2) and O(n^1.5) worst case. But in case of
shallow but wide (or deep but thin) XML trees the complexity of
xmlXPathCmpNodes grows to O(n), since this function compares the
position of two nodes by going through branches and levels of the
tree. So we have O(n)*O(n^1.2) which gives us O(n^2.2).
I have several ideas about how to improve this:
1) one which I've implemented and which gives very nice results is
very simple and works (though it may seem crazy at the first
glance). I've replaced Shell's sort with Bubble sort. As everyone
knows its complexity is between O(n) and O(n^2), which is the worst
case (inverted list) and of course worse than Shell's sort in
average case. How ever, in case of presorted node-list, bubble sort
always compares two subsequent nodes in the list in which case
xmlXPathCmpNodes does not go far when traversing the tree and is
O(1). (Both Shell's sort and quick-sort usually compares nodes far
in the list, so xmlXPathCmpNodes has far journeys pretty often).
Since in most cases, XPath results *are* actually already well
sorted (and I can hardly imagine an expression resulting in a long
reversely ordered node-list), bubble sort seems much better choice
(in the example of XML document mentioned above we have
O(n)*O(1)=O(n) instead of O(n^2.2)). My benchmarks on docbook xslt
stylesheets and large xml documents proved that bubble sort is even
in average a little faster than Shell's sort.
The implementation I used:
#define XML_NODESET_USE_BUBBLE_SORT
void
xmlXPathNodeSetSort(xmlNodeSetPtr set) {
int i, j, incr, len;
xmlNodePtr tmp;
if (set == NULL)
return;
#ifdef XML_NODESET_USE_BUBBLE_SORT
/* Use bubble sort to sort the node-set */
len = set->nodeNr;
j=0;
for (i = 1; i < len; i++) {
if (xmlXPathCmpNodes(set->nodeTab[i-1],
set->nodeTab[i]) == -1) {
tmp = set->nodeTab[i-1];
set->nodeTab[i-1] = set->nodeTab[i];
set->nodeTab[i] = tmp;
if (i>1) i-=2;
j++;
}
}
/* fprintf(stderr, "\nMade %d switches on %d nodes\n",j,len); */
#else /* XML_NODESET_USE_BUBBLE_SORT */
/* Use Shell's sort to sort the node-set */
len = set->nodeNr;
for (incr = len / 2; incr > 0; incr /= 2) {
for (i = incr; i < len; i++) {
j = i - incr;
while (j >= 0) {
if (xmlXPathCmpNodes(set->nodeTab[j],
set->nodeTab[j + incr]) == -1) {
tmp = set->nodeTab[j];
set->nodeTab[j] = set->nodeTab[j + incr];
set->nodeTab[j + incr] = tmp;
j -= incr;
} else
break;
}
}
}
#endif /* XML_NODESET_USE_BUBBLE_SORT */
}
With bubble sort, benchmarks look realy nice for these flat XML
documents (I'm not cheating:)):
4000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
8000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
16000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
32000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
64000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs
To give a smal comparison on DocBook XML stylesheets:
With Shell's sort:
Running stylesheet and saving result took 4600 ms
With bubble sort:
Running stylesheet and saving result took 4491 ms
2) Some pre-ordering could be done before sorting a large node-set
(say above 1000 nodes). This would be realised by traversing the
XML tree and storing node's position somewhere in the xmlNode
structure (probably requires a new slot to be created). Then
xmlXPathCmpNodes would only need to compare integer numbers which
would yield back in complexity of O(1), which gives the sort its
original O(n^1.2). Obviously, since this requires one pass through
the whole document, this won't be worth it for small node-sets.
What do you think?
-- Petr Pajas
XSH - http://xsh.sourceforge.net
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml
Hi,
There was a discussion on this subject last month. See the mail archives
and search for "Profiling and possible speed improvements". There was
some interesting background on why shell sort is used and what happens
if you try to replace it with qsort.
I think that your suggestions have merit (especially 2), however it's
going to take some time to work out the best way to address this problem.
I'm spending a little time looking through libxml2 for optimization
opportunities. If anyone has any suggestions for areas which they think
should be considered, send them through to the list and I'll add them to
my list of things to examine: which so far consists of:
1. Buffer and memory management in general.
2. XPath node sorting.
3. All TODO statements in the comments
Gary
PS Don't expect any immediate results, this is a low priority thread and
I don't expect I'll have more than an hour or so a week to do this work.
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]