RE: [xslt] RE: [xml] does xsltproc caches subexpressions
- From: "Buchcik, Kasimier" <k buchcik 4commerce de>
- To: "The Gnome XSLT library mailing-list" <xslt gnome org>
- Cc: Frans Englich <frans englich telia com>
- Subject: RE: [xslt] RE: [xml] does xsltproc caches subexpressions
- Date: Mon, 22 May 2006 19:17:02 +0200
Hi,
> -----Original Message-----
> From: xslt-bounces gnome org [mailto:xslt-bounces gnome org]
> On Behalf Of Frans Englich
>
> On Monday 22 May 2006 10:54, Buchcik, Kasimier wrote:
> [...]
>
> > The tiny "[1]) = 1" is the part
> > of the expression, which e.g. Saxon can use to optimize the
> evaluation.
> > With 1732 nodes in the key, Saxon took ~12s on a test box,
> while Libxslt
> > took ~50s for evaluation of this expression. On the other hand, if I
> > changed this expression to use "count(.....) > 1" (i.e. without the
> > "[1]")
> > then Saxon needed already ages to process 300 nodes, while
> Libxslt times
> > were still linear. So Saxon is optimized, if "[1]" is used;
> > Libxslt/Libxml2
> > not.
>
> What kind of optimizations does libxml2 do on predicated, in general?
>
> For the "[1]" predicate, I think two types can be done(when
> we're talking about implementations like Saxon and libxml2):
>
> * Analysis of context dependency, and as result conclude how
> the predicate must be evaluated. For example, "1" doesn't depend on
the
> context and will therefore always evaluate to the same. However, an
expression
> involving "." would have to be evaluated each time. End point, some
> predicates only have to be evaluated once.
>
> * Early exit. For example, one knows that "1" will never
> match beyond the first item.
>
> Saxon implements both.
The following might be a bit vague, since I only scratched Libxml2's
XPath
code.
I think Libxml2 does only the "early exit" optimization; at least I
haven't
found code indicating the opposite. Although in the case of [n], it
doesn't
evaluate "n", but just triggers execution of optimized evaluation
functions
wrt the position. The 3 functions are:
1) xmlXPathCompOpEvalFirst() - used for "[1]"
2) xmlXPathCompOpEvalLast() - used for "last()"
3) xmlXPathNodeCollectAndTestNth() - used for "[n]"
But there's one catch for the "early exit": In Libxml2 each predicate is
evaluated on the node-set result of its preceding predicate; so the
"early exit" is only possible at the last level of predicates.
Example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="mykey" match="//foo" use="@val"/>
<xsl:template match="/">
<xsl:value-of select="count(key('mykey',
'a')[ boo][@zoo][1])"/>
</xsl:template>
</xsl:stylesheet>
<?xml version="1.0"?>
<foo val="a" boo="x" zoo="x">
<foo val="a" boo="x" zoo="x"/>
</foo>
Theoretically the evaluation could stop on the first "foo" element in
doc-order, but it does not: first the predicate "@boo" is applied to
both "foo" elements, which results in a node-set containing both
elements.
Then the predicate "@zoo" is applied to the node-set, and only here we
can
"exit early" on the first match. I know this goes quite hand in hand
with
what the XPath says, but I think we loose opportunities for optimization
here.
I think the result would be same same, if the predicates were evaluated
all
in a row on each node of the initial node-set. If such a mechanism would
be more complex than the one existing in an other question.
It would be interesting to hear how Saxon has implemented this.
Regards,
Kasimier
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]