Re: [xml] Deep-copy of xsltStylesheetPtr?



On Sat, 18 Sep 2004, Rasmus Lerdorf wrote:
> On Sat, 18 Sep 2004, Daniel Veillard wrote:
> >   Another thing which should not be underestimated is that most
> > useful stylesheets are compound, made of multiple XML documents via
> > xsl:include or xsl:import.  Those are loaded on the fly when the
> > stylesheet is compiled, so they would escape your caching scheme
> > unless you override the loading functions (at the libxml2 level, or
> > with xsltSetLoaderFunc() from recent libxslt/documents.h).
>
> Good point.  I should be able to override that.  I'll get the simpler
> case working first and run some benchmarks to see what this actually
> gets me.

Well, after benchmarking, it turns out that xmlCopyDoc() is actually the
bottleneck in this scheme.  It is way slower to call xmlCopyDoc() to copy
the xmlDoc from shared memory to request memory than it is to simply
reparse the XML document from disk.

Using the standard Belgian waffles menu XML and XSL sample data, if I do a
normal:

  <?php
    $xml = domxml_open_file('data.xml');
    $xsl = domxml_xslt_stylesheet_file('data.xsl');
    $out = $xsl->process($xml);
    echo $out->dump_mem();
  ?>

I get around 274 requests/second from this rather old FreeBSD4 dev box I
am testing on.

If I cache the domdocs in shared memory and copy them out using
xmlCopyDoc() with this:

  <?php
    $xmldoc = apc_fetch('xml');
    $xsldoc = apc_fetch('xsl');
    $xsl = domxml_xslt_stylesheet_doc($xsldoc);
    $out = $xsl->process($xmldoc);
    echo $out->dump_mem();
  ?>

I drop to 126 requests/sec.
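
For reference, the cache gets populated once with something along these
lines (a sketch only; stock apc_store() can't hold a domxml resource, so
this relies on the modified cache doing the xmlCopyDoc() deep copy into
shared memory itself):

  <?php
    // One-time population (sketch): parse from disk once; the cache
    // itself deep-copies the xmlDoc structs into shared memory.
    if (apc_fetch('xml') === false) {
        apc_store('xml', domxml_open_file('data.xml'));
        apc_store('xsl', domxml_open_file('data.xsl'));
    }
  ?>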

If I skip the XSLT step and implement my own substitutions directly in PHP
like this (I left the header and footer HTML off to shorten this a bit):

<?php
$dom  = domxml_open_file('data.xml');
$root = $dom->document_element();
$node = $root->first_child();
$i = 0;
// Walk the top-level elements and flatten each one into an
// associative array keyed by tag name (name, price, etc.).
while ($node) {
    $subnode = $node->first_child();
    while ($subnode) {
        if ($subnode->tagname) {
            $menu[$i][$subnode->tagname] = $subnode->get_content();
        }
        $subnode = $subnode->next_sibling();
    }
    $i++;
    $node = $node->next_sibling();
}
// Emit the same HTML the stylesheet would have produced.
foreach ($menu as $item) {
echo <<<EOT
<div style="background-color:teal;color:white;padding:4px">
  <span style="font-weight:bold;color:white">$item[name]</span>
  - $item[price]
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
  $item[description]
  <span style="font-style:italic">($item[calories] calories per
serving)</span>
</div>
EOT;
}
?>

This runs at around 267 requests/second, which is close to the 274 that the
XSLT processor did, so PHP and libxslt seem to parse and substitute at
about the same rate.  Now, if I read the xmlDoc struct out of shared
memory in the above example (i.e. replace the domxml_open_file() call with
$dom = apc_fetch('xml');), I drop to 157 req/sec.

The only way to get a significant speedup is to cache the final $menu
array that my simple little parser creates.  If I grab $menu from shared
memory, I jump to 600 req/sec.
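
In code, that final-stage cache looks something like this (a sketch; the
'menu' key and the rebuild-on-miss branch are illustrative, but storing a
plain PHP array with apc_fetch()/apc_store() is stock APC):

<?php
// Fetch the fully flattened menu from shared memory; rebuild it
// from the XML only on a cache miss.
$menu = apc_fetch('menu');
if ($menu === false) {
    $dom  = domxml_open_file('data.xml');
    $root = $dom->document_element();
    $node = $root->first_child();
    $i = 0;
    while ($node) {
        $subnode = $node->first_child();
        while ($subnode) {
            if ($subnode->tagname) {
                $menu[$i][$subnode->tagname] = $subnode->get_content();
            }
            $subnode = $subnode->next_sibling();
        }
        $i++;
        $node = $node->next_sibling();
    }
    apc_store('menu', $menu);
}
// ...then the same foreach/echo HTML loop as above.
?>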

The conclusion to all this is that xmlCopyDoc() is just too damn slow for
me to do it this way.  I either have to work directly from shared memory
or cache beyond the xmlDoc stage, like I did here.

-Rasmus


