Re: [xml] XPath optimizations



Petr Pajas wrote:

Hi All,
I found that libxml2 implementation is quite inefficient when computing
XPath on shallow but wide XML trees. I was trying to examine the case
of one-level document of the form <root><e/><e/>...</e></root> with
several thousand copies of element <e/>. The tested XPath expression
was count(/root/e).
A summary of my benchmarks follows. Later, I give some ideas on how to
improve this:

4000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

8000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

16000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 7 wallclock secs

32000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took:52 wallclock secs

 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took:292 wallclock secs
0.00/s (n=1)

As you may see, the time needed to parse the document is linear,
however time needed to evaluate the expression is worse than O(n^2).

Examining the source code with a profiler showed that the problem is
caused by xmlXPathNodeSetSort and xmlXPathCmpNodes. The complexity is
in this case actually something like O(n^2.2).
The sort method used in xmlXPathNodeSetSort is Shell's sort which
itself is st. between O(n^1.2) and O(n^1.5) worst case. But in case of
shallow but wide (or deep but thin) XML trees the complexity of
xmlXPathCmpNodes grows to O(n), since this function compares the
position of two nodes by going through branches and levels of the
tree. So we have O(n)*O(n^1.2) which gives us O(n^2.2).

I have several ideas about how to improve this:

1) one which I've implemented and which gives very nice results is
  very simple and works (though it may seem crazy at the first
  glance). I've replaced Shell's sort with Bubble sort. As everyone
  knows its complexity is between O(n) and O(n^2), which is the worst
  case (inverted list) and of course worse than Shell's sort in
  average case. How ever, in case of presorted node-list, bubble sort
  always compares two subsequent nodes in the list in which case
  xmlXPathCmpNodes does not go far when traversing the tree and is
  O(1). (Both Shell's sort and quick-sort usually compares nodes far
  in the list, so xmlXPathCmpNodes has far journeys pretty often).
  Since in most cases, XPath results *are* actually already well
  sorted (and I can hardly imagine an expression resulting in a long
  reversely ordered node-list), bubble sort seems much better choice
  (in the example of XML document mentioned above we have
  O(n)*O(1)=O(n) instead of O(n^2.2)). My benchmarks on docbook xslt
  stylesheets and large xml documents proved that bubble sort is even
  in average a little faster than Shell's sort.

  The implementation I used:

#define XML_NODESET_USE_BUBBLE_SORT
void
xmlXPathNodeSetSort(xmlNodeSetPtr set) {
   int i, j, incr, len;
   xmlNodePtr tmp;

   if (set == NULL)
        return;

#ifdef XML_NODESET_USE_BUBBLE_SORT
   /* Use bubble sort to sort the node-set */
   len = set->nodeNr;
   j=0;
   for (i = 1; i < len; i++) {
     if (xmlXPathCmpNodes(set->nodeTab[i-1],
                           set->nodeTab[i]) == -1) {
        tmp = set->nodeTab[i-1];
        set->nodeTab[i-1] = set->nodeTab[i];
        set->nodeTab[i] = tmp;
        if (i>1) i-=2;
        j++;
     }
   }
   /* fprintf(stderr, "\nMade %d switches on %d nodes\n",j,len); */
#else /* XML_NODESET_USE_BUBBLE_SORT */
   /* Use Shell's sort to sort the node-set */
   len = set->nodeNr;
   for (incr = len / 2; incr > 0; incr /= 2) {
        for (i = incr; i < len; i++) {
            j = i - incr;
            while (j >= 0) {
                if (xmlXPathCmpNodes(set->nodeTab[j],
                                     set->nodeTab[j + incr]) == -1) {
                    tmp = set->nodeTab[j];
                    set->nodeTab[j] = set->nodeTab[j + incr];
                    set->nodeTab[j + incr] = tmp;
                    j -= incr;
                } else
                    break;
            }
        }
   }
#endif /* XML_NODESET_USE_BUBBLE_SORT */
}

  With bubble sort, benchmarks look realy nice for these flat XML
  documents (I'm not cheating:)):

4000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

8000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

16000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

32000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

64000 <e/> elements
parse took: 0 wallclock secs
xpath count(/root/e) took: 0 wallclock secs

  To give a smal comparison on DocBook XML stylesheets:

  With Shell's sort:
  Running stylesheet and saving result took 4600 ms
With bubble sort: Running stylesheet and saving result took 4491 ms

2) Some pre-ordering could be done before sorting a large node-set
  (say above 1000 nodes). This would be realised by traversing the
  XML tree and storing node's position somewhere in the xmlNode
  structure (probably requires a new slot to be created). Then
  xmlXPathCmpNodes would only need to compare integer numbers which
  would yield back in complexity of O(1), which gives the sort its
  original O(n^1.2). Obviously, since this requires one pass through
  the whole document, this won't be worth it for small node-sets.

What do you think?

-- Petr Pajas

XSH - http://xsh.sourceforge.net
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml
Hi,

There was a discussion on this subject last month. See the mail archives and search for "Profiling and possible speed improvements". There was some interesting background on why shell sort is used and what happens if you try to replace it with qsort.

I think that your suggestions have merit (especially 2), however it's going to take some time to work out the best way to address this problem.

I'm spending a little time looking through libxml2 for optimization opportunities. If anyone has any suggestions for areas which they think should be considered, send them through to the list and I'll add them to my list of things to examine: which so far consists of:

1. Buffer and memory management in general.
2. XPath node sorting.
3. All TODO statements in the comments

Gary

PS Don't expect any immediate results, this is a low priority thread and I don't expect I'll have more than an hour or so a week to do this work.




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]