[xslt] xsl:key degenerate performance
- From: "Chris Trengove" <trengove econdata com au>
- To: xslt gnome org
- Subject: [xslt] xsl:key degenerate performance
- Date: Tue, 03 Feb 2004 13:19:49 +1100
I have come across a performance issue with the implementation of xsl:key when
dealing with a large number of nodes. To demonstrate this, if one processes
documents like
<KeyTest>
<Cell><Value>0</Value></Cell>
<Cell><Value>1</Value></Cell>
<Cell><Value>2</Value></Cell>
<!--Lots more similar lines follow-->
</KeyTest>
using
<xsl:key name="cells-key" match="/KeyTest/Cell" use="Value"/>
the processing time seems to grow exponentially/geometrically as the number of
nodes grows. For 1000 nodes, the processing is virtually instantaneous, for 10000
nodes, it takes about 6 seconds, while for 100000 nodes processing time is in the
order of 1 hour! (This is with libxslt-1.1.2.)
I have been able to check what is happening in a profiler, and it turns that virtually
all of this time is spent in the single call to xmlXPathNodeSetSort() which occurs as
part of the compilation of the "match" expression, the idea being to put the resulting
node set into document order. For example, for the 10000 node case, this sort
operation takes all but about 0.1 second of the total processing time. This is
particularly unfortunate in this case, since the nodes are already sorted, and the
whole operation is unnecessary!
The next question is what to do about this. Some possibilities might be:
(a) Don't sort the result of the xsl:key "match" expression. From my reading of the
XSLT standard, it does not suggest that the result returned by the key() function
should be in document order (or any particular order).
(b) Since implementing (a) might break some existing stylesheets which relied on
the document order assumption, introduce some global setting to force sorting on or
off.
(c) Try and be a bit smarter in detecting when it is actually necessary to sort, since
in almost all cases, the nodes will be in document order in any case.
It would be really nice to fix this, since the other parts of the xsl:key, key()
implementation appear to be really fast, and the only thing holding back the
handling of larger documents is the unnecessary sort operation.
Chris Trengove
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]