RE: [xslt] RE: [xml] does xsltproc caches subexpressions


> -----Original Message-----
> From: Stefan Kost [mailto:ensonic hora-obscura de] 


> Am Montag, den 22.05.2006, 12:54 +0200 schrieb Buchcik, Kasimier:
> > Hi,


> > The next bottleneck in the row is the template "indexterm" (mode =
> > "reference")
> > in "autoidx.xsl":

Next results: 

1) The expression "//index[&scope;][1]" turned out to be the bottleneck
  of the "indexterm" template (mode="reference").

We made the following enhancements to the XPath module:
2) Added an XPath object cache. This avoids massive creation/freeing
  of XPath objects. For the generation of the gst-docs, 44 million
  XPath objects were created; after the cache was in use this dropped
  to 22 million objects. If all slots of the cache are filled to the
  maximum, it will occupy about 40 KB of additional memory.

3) Enhanced xmlXPathNodeCollectAndTest(), which is the central function
  for evaluation of steps.

 a) We eliminated massive recreation of xmlNodeSet structs; this was a
  big bottleneck when traversing the descendant-or-self axis, since for
  every traversed node a new xmlNodeSet was created. Just count all
  nodes of any type in the XInclude-processed input document of
  gst-docs and you'll have the number of structs created for every
  evaluation of this axis.

 b) The following comes from the ChangeLog, since I don't want to
  invent another explanation:
    Optimized xmlXPathNodeCollectAndTest() and
    xmlXPathNodeCollectAndTestNth() to evaluate a compound
    traversal of 2 axes when we have a "//foo" expression.
    This is done with a rewrite of the XPath AST in
    xmlXPathRewriteDOSExpression(); I added an additional field
    to xmlXPathStepOp for this (but the field's name should be
    changed). The mechanism: the embracing descendant-or-self
    axis traversal (also optimized to return only nodes which
    can hold elements), will produce context nodes for the
    inner traversal of the child axis. This way we avoid a full
    node-collecting traversal of the descendant-or-self axis.
    Some tests indicate that this can reduce execution time of
    "//foo" to 50%. Together with the XPath object cache this
    all significantly speeds up libxslt.

So the previous most-time-consuming templates were:

number  match      name               mode                      Calls   Tot 100us  Avg

0       indexterm                     reference                  2464    13039183
1                  gentext.template                             16047     2219472
2                  user.head.content                               53     2216486
3                                     chunk                    191008     1984551
4       *                             recursive-chunk-filename  92686      799234

Current result:

number  match      name               mode                      Calls   Tot 100us  Avg

0       indexterm                     reference                  2464     3425896
1                                     chunk                    191008     1609874
2                  gentext.template                             16047     1261323
3                                     dbhtml-dir               140277      710035
4       *                             recursive-chunk-filename  92686      703808
5                  user.head.content                               53      600561

Stefan, it would be great if you could try the current CVS HEAD of
Libxml2/Libxslt for the gst-docs generation. It would also be
interesting to see whether this fixes the issue for Ed Catmur and his
sparse-memory setup. But you still need to customize xsltproc.c in
order to activate the object cache: it is disabled by default, so we
need to make people aware that it's there and can be activated if
things run slowly.
Add a call to xmlXPathContextSetObjectCache() in xsltproc.c after
the creation of the transformation context:

ctxt = xsltNewTransformContext(cur, doc);
if (ctxt == NULL)
    return;
if (ctxt->xpathCtxt != NULL)
    xmlXPathContextSetObjectCache(ctxt->xpathCtxt, 1, -1, 0);


