Re: [xslt] RE: [xml] does xsltproc caches subexpressions
- From: Frans Englich <frans englich telia com>
- To: "Buchcik, Kasimier" <k buchcik 4commerce de>
- Cc: The Gnome XSLT library mailing-list <xslt gnome org>
- Subject: Re: [xslt] RE: [xml] does xsltproc caches subexpressions
- Date: Wed, 24 May 2006 11:18:54 +0000
On Monday 22 May 2006 17:17, Buchcik, Kasimier wrote:
[...]
> Theoretically the evaluation could stop on the first "foo" element in
> doc-order, but it does not: first the predicate "@boo" is applied to
> both "foo" elements, which results in a node-set containing both
> elements.
> Then the predicate "@zoo" is applied to the node-set, and only here we
> can
> "exit early" on the first match. I know this goes quite hand in hand
> with
> what the XPath says, but I think we loose opportunities for optimization
> here.
>
> I think the result would be same same, if the predicates were evaluated
> all
> in a row on each node of the initial node-set. If such a mechanism would
> be more complex than the one existing in an other question.
>
> It would be interesting to hear how Saxon has implemented this.
I am not deeply familiar with Saxon, but have read parts of it, and in broad
terms it is very much similar to my own implementation[1], which in some
cases have been modelled after Saxon.
Evaluation is done by building a chain of Java-style Iterators. That is,
evaluation logic doesn't sit in the AST node and builds a list containing its
result, but an Iterator is allocated which handles requests on an
item-per-item basis. An example:
/foo/bar/[1]
Evaluating this expression starts by asking "[1]" for one item. This in turn
asks "bar" for one item, which in turn asks "foo" for one item. So, no more
"foo" elements are detected than the first one that contains a bar element.
This lazy-evaluation, chaining of pull-based iterator applies to pretty much
everything(even XQuery user defined functions when allowed), but sometimes
has to be broken due to sorting, for example. Here's another example:
XPath 2.0 has the "range" expression which allows generation of sequences of
integers. For example, "1 to 4" generates the sequence (1, 2, 3, 4). The
implementation consists of two parts, an AST node(called RangeExpression) and
an iterator(RangeIterator). When evaluating the expression (1 to 10000)
[last()], the RangeExpression returns an RangeIterator which subsequently is
asked to return the last item. The end point is that it evaluates in constant
time. (Not something one would encounter in the wild that often, but it
demonstrates the point.)
This is not an exact description of Saxon nor my implementation, but the
priniple is in the right direction.
Cheers,
Frans
1.
It has not yet been released, but is becoming an XQuery 1.0/XPath 2.0/XSL-T
2.0 implementation for KDE. Keywords in a nutshell is probably "statically
typed", "pull-based data handling". Tokenizer is handwritten, parser Bison
based. Source is here:
svn://anonsvn.kde.org/home/kde/trunk/kdenonbeta/kdom/xpath
To browse the documentation, run `doxygen` and browse xpath/html/index.html.
The classes "Expression" and "Iterator" are central.
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]