[xml] XPath optimizations

From: Petr Pajas <pajas ufal ms mff cuni cz>
To: libxml2 <xml gnome org>
Subject: [xml] XPath optimizations
Date: Tue, 07 May 2002 19:23:23 +0200
Hi All, 

I found that libxml2 implementation is quite inefficient when computing
XPath on shallow but wide XML trees. I was trying to examine the case
of one-level document of the form <root><e/><e/>...</e></root> with
several thousand copies of element <e/>. The tested XPath expression
was count(/root/e). 

A summary of my benchmarks follows. Later, I give some ideas on how to
improve this:

4000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

8000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

16000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 7 wallclock secs

32000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took:52 wallclock secs

  <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took:292 wallclock secs
0.00/s (n=1)

As you may see, the time needed to parse the document is linear,
however time needed to evaluate the expression is worse than O(n^2).

Examining the source code with a profiler showed that the problem is
caused by xmlXPathNodeSetSort and xmlXPathCmpNodes. The complexity is
in this case actually something like O(n^2.2). 

The sort method used in xmlXPathNodeSetSort is Shell's sort which
itself is st. between O(n^1.2) and O(n^1.5) worst case. But in case of
shallow but wide (or deep but thin) XML trees the complexity of
xmlXPathCmpNodes grows to O(n), since this function compares the
position of two nodes by going through branches and levels of the
tree. So we have O(n)*O(n^1.2) which gives us O(n^2.2).

I have several ideas about how to improve this:

1) one which I've implemented and which gives very nice results is
   very simple and works (though it may seem crazy at the first
   glance). I've replaced Shell's sort with Bubble sort. As everyone
   knows its complexity is between O(n) and O(n^2), which is the worst
   case (inverted list) and of course worse than Shell's sort in
   average case. How ever, in case of presorted node-list, bubble sort
   always compares two subsequent nodes in the list in which case
   xmlXPathCmpNodes does not go far when traversing the tree and is
   O(1). (Both Shell's sort and quick-sort usually compares nodes far
   in the list, so xmlXPathCmpNodes has far journeys pretty often).
   Since in most cases, XPath results *are* actually already well
   sorted (and I can hardly imagine an expression resulting in a long
   reversely ordered node-list), bubble sort seems much better choice
   (in the example of XML document mentioned above we have
   O(n)*O(1)=O(n) instead of O(n^2.2)). My benchmarks on docbook xslt
   stylesheets and large xml documents proved that bubble sort is even
   in average a little faster than Shell's sort.

   The implementation I used:

#define XML_NODESET_USE_BUBBLE_SORT
void
xmlXPathNodeSetSort(xmlNodeSetPtr set) {
    int i, j, incr, len;
    xmlNodePtr tmp;

    if (set == NULL)
        return;

#ifdef XML_NODESET_USE_BUBBLE_SORT
    /* Use bubble sort to sort the node-set */
    len = set->nodeNr;
    j=0;
    for (i = 1; i < len; i++) {
      if (xmlXPathCmpNodes(set->nodeTab[i-1],
                           set->nodeTab[i]) == -1) {
        tmp = set->nodeTab[i-1];
        set->nodeTab[i-1] = set->nodeTab[i];
        set->nodeTab[i] = tmp;
        if (i>1) i-=2;
        j++;
      }
    }
    /* fprintf(stderr, "\nMade %d switches on %d nodes\n",j,len); */
#else /* XML_NODESET_USE_BUBBLE_SORT */
    /* Use Shell's sort to sort the node-set */
    len = set->nodeNr;
    for (incr = len / 2; incr > 0; incr /= 2) {
        for (i = incr; i < len; i++) {
            j = i - incr;
            while (j >= 0) {
                if (xmlXPathCmpNodes(set->nodeTab[j],
                                     set->nodeTab[j + incr]) == -1) {
                    tmp = set->nodeTab[j];
                    set->nodeTab[j] = set->nodeTab[j + incr];
                    set->nodeTab[j + incr] = tmp;
                    j -= incr;
                } else
                    break;
            }
        }
    }
#endif /* XML_NODESET_USE_BUBBLE_SORT */
}

   With bubble sort, benchmarks look realy nice for these flat XML
   documents (I'm not cheating:)):

4000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

8000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

16000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

32000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

64000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

   To give a smal comparison on DocBook XML stylesheets:

   With Shell's sort:
   Running stylesheet and saving result took 4600 ms
   With bubble sort:    
   Running stylesheet and saving result took 4491 ms

2) Some pre-ordering could be done before sorting a large node-set
   (say above 1000 nodes). This would be realised by traversing the
   XML tree and storing node's position somewhere in the xmlNode
   structure (probably requires a new slot to be created). Then
   xmlXPathCmpNodes would only need to compare integer numbers which
   would yield back in complexity of O(1), which gives the sort its
   original O(n^1.2). Obviously, since this requires one pass through
   the whole document, this won't be worth it for small node-sets.

What do you think?

-- Petr Pajas

XSH - http://xsh.sourceforge.net
Follow-Ups:
- Re: [xml] XPath optimizations
  - From: Gary Pennington
- Re: [xml] XPath optimizations
  - From: Bjorn Reese
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]