[xslt] Profiling and possible speed improvements
- From: frodol dds nl
- To: xml gnome org, xslt gnome org
- Subject: [xslt] Profiling and possible speed improvements
- Date: Tue, 23 Apr 2002 20:41:10 +0200 (CEST)
Hi folks,
My xslt stylesheet took a bit long to apply (many hours, but amongst other
things it creates an index of my document, which is probably a bit of a
stretch to do with xslt), so I profiled it with gprof to see where it
spent its time. It seems that most time is consumed within xmlXPathCmpNodes,
which in turn is called from xmlXPathNodeSetSort. As this uses a Shell
sort algorithm, and it seems that I am handling somewhat large nodesets,
I wondered whether it would help to use another sorting algorithm (with
a little luck, that would reduce the number of node comparisons needed).
To keep it simple, I tried the qsort algorithm of glibc-2.2.5, which
indeed uses a quick-sort. This seemed to help quite a lot.
I have attached the diff I used to illustrate the changes needed.
Basically, I created xmlXPathCmpNodes_qsort that does the same job
as xmlXPathCmpNodes, except that it conforms to qsort semantics
(don't return -2, -1 means less than, use pointers to elements).
Don't apply this blindly, it needs some more work.
Thanks for the good work,
Frodo
--
Frodo Looijaard <frodol@dds.nl> PGP key and more: http://huizen.dds.nl/~frodol
Defenestration n. (formal or joc.):
The act of removing Windows from your computer in disgust, usually followed
by the installation of Linux or some other Unix-like operating system.
*** /tmp/libxml2-2.4.20/xpath.c Thu Apr 11 04:35:16 2002
--- libxml2-2.4.20/xpath.c Tue Apr 23 20:36:22 2002
***************
*** 1410,1413 ****
--- 1410,1491 ----
/**
+ * xmlXPathCmpNodes:
+ * @node1: the first node
+ * @node2: the second node
+ *
+ * Compare two nodes w.r.t document order
+ *
+ * Returns 0 in case of error -1 if first point < second point, 0 if
+ * that's the same node, 1 otherwise
+ */
+ int
+ xmlXPathCmpNodes_qsort(xmlNodePtr *node1_ptr, xmlNodePtr *node2_ptr) {
+ int depth1, depth2;
+ xmlNodePtr cur, root,node1=*node1_ptr,node2=*node2_ptr;
+
+ if ((node1 == NULL) || (node2 == NULL))
+ /*
+ * a couple of optimizations which will avoid computations in most cases
+ */
+ if (node1 == node2)
+ return(0);
+ if ((node1->type == XML_NAMESPACE_DECL) ||
+ (node2->type == XML_NAMESPACE_DECL))
+ return(-1);
+ if (node1 == node2->prev)
+ return(-1);
+ if (node1 == node2->next)
+ return(1);
+
+ /*
+ * compute depth to root
+ */
+ for (depth2 = 0, cur = node2;cur->parent != NULL;cur = cur->parent) {
+ if (cur == node1)
+ return(-1);
+ depth2++;
+ }
+ root = cur;
+ for (depth1 = 0, cur = node1;cur->parent != NULL;cur = cur->parent) {
+ if (cur == node2)
+ return(1);
+ depth1++;
+ }
+ /*
+ * Distinct document (or distinct entities :-( ) case.
+ */
+ if (root != cur) {
+ return(0);
+ }
+ /*
+ * get the nearest common ancestor.
+ */
+ while (depth1 > depth2) {
+ depth1--;
+ node1 = node1->parent;
+ }
+ while (depth2 > depth1) {
+ depth2--;
+ node2 = node2->parent;
+ }
+ while (node1->parent != node2->parent) {
+ node1 = node1->parent;
+ node2 = node2->parent;
+ /* should not happen but just in case ... */
+ if ((node1 == NULL) || (node2 == NULL))
+ return(0);
+ }
+ /*
+ * Find who's first.
+ */
+ if (node1 == node2->next)
+ return(1);
+ for (cur = node1->next;cur != NULL;cur = cur->next)
+ if (cur == node2)
+ return(-1);
+ return(1); /* assume there is no sibling list corruption */
+ }
+
+ /**
* xmlXPathNodeSetSort:
* @set: the node set
***************
*** 1423,1426 ****
--- 1501,1505 ----
return;
+ #if 0
/* Use Shell's sort to sort the node-set */
len = set->nodeNr;
***************
*** 1440,1443 ****
--- 1519,1526 ----
}
}
+ #else
+ /* Try QuickSort here */
+ qsort(set->nodeTab,set->nodeNr,sizeof(set->nodeTab[0]),xmlXPathCmpNodes_qsort);
+ #endif
}
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]