Re: [xml] [Patch] Optimizing '//' in XPath expressions



On Fri, Aug 24, 2012 at 12:47:16AM -0400, Liam R E Quin wrote:
On Fri, 2012-08-24 at 12:21 +0800, Daniel Veillard wrote:
[...]
I suspect it's just the tip of the iceberg; there are a number of other
post-compilation optimizations which can certainly be made, but with
less drastic improvements.

Mike Kay has spoken at I think XML Prague and/or Balisage about the
optimizations in Saxon; from my imperfect memory :) they include
building an element index during parsing, expression rewriting, and
using a bytecode interpreter.

  I still have this in my bookmarks :-)
  http://www.ibm.com/developerworks/library/x-xslt2/?dwzone=x?open&l=132%2ct=gr%2c+p=saxon

There have been papers on XPath optimization in the context of XQuery,
some of which may also apply (e.g. at VLDB).

I think a lot of stylesheet writers have learned to avoid //x even in
implementations where it's basically O(1) these days. But it's still
worth speeding up :-)
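
As a rough illustration of the element-index idea mentioned above, here is a
minimal Python sketch. It is not how Saxon or libxml2 actually do it, and
`build_element_index` is a hypothetical helper invented for this example. One
pass over the tree records every element under its name, after which a
`//x`-style lookup becomes a dictionary access instead of a full traversal.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_element_index(root):
    """Map each tag name to its elements, kept in document order.

    Hypothetical helper: a real processor would build this while parsing.
    """
    index = defaultdict(list)
    for elem in root.iter():  # iter() visits elements in document order
        index[elem.tag].append(elem)
    return index


doc = ET.fromstring("<a><x id='1'/><b><x id='2'/><c><x id='3'/></c></b></a>")
index = build_element_index(doc)

# Indexed lookup: no tree walk needed once the index exists.
indexed_ids = [e.get("id") for e in index["x"]]

# Unindexed lookup: walks the whole tree below the root.
walked_ids = [e.get("id") for e in doc.findall(".//x")]
```

Both lookups return the same elements in the same order; the difference is
that the index only works if the whole document was seen up front.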

  One of the key differences is that XPath is implemented in libxml2
and XSLT in libxslt, a different library. Some of those optimizations
would not carry over that easily. For example, libxml2 allows XPath to
be used on subtrees, such as in an xmlReader expanded context, so I can't
assume full indexing of the document. The key point is that libxslt makes
use of an ordered numbering of the document, which XPath in libxml2 can
then reuse, for example to compare nodes, but the XPath code in
libxml2 cannot assume the availability of that numbering. That's one
example of things made slightly harder by the split into different
components.
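
To make the numbering point concrete, here is a small Python sketch of the
general technique. It is only an illustration under assumed names
(`number_nodes`, `precedes_with_numbering` and `precedes_by_walking` are
hypothetical), not the libxml2 or libxslt code. With a document-order number
attached to every node, comparing two nodes is a single integer comparison;
without it, the comparison has to fall back to walking the tree.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Assign each element its position in document order (hypothetical helper)."""
    return {elem: pos for pos, elem in enumerate(root.iter())}


def precedes_with_numbering(order, a, b):
    """True if a comes before b; constant time once numbering exists."""
    return order[a] < order[b]


def precedes_by_walking(root, a, b):
    """Fallback without numbering: scan in document order until a or b is met."""
    for elem in root.iter():
        if elem is a:
            return a is not b
        if elem is b:
            return False
    raise ValueError("neither node belongs to this tree")


doc = ET.fromstring("<a><b><c/></b><d/></a>")
c = doc.find("b/c")
d = doc.find("d")

order = number_nodes(doc)  # only possible if the whole document is available
fast = precedes_with_numbering(order, c, d)
slow = precedes_by_walking(doc, c, d)
```

The two functions agree on the answer; the fallback is what remains when
the caller only has a subtree and no precomputed numbering.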

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


