[xslt] Performance issue when matching text()|*



We encountered a strange performance problem related
to the cost of building large nodesets.  In
particular, it seems that the cost of building a
nodeset is not linear in the size.  To demonstrate
this we created three XML files:

<?xml version="1.0" ?>
<topnode>
<inner-node/>
<inner-node/>
 ...
</topnode>

The first had 1,000 of the inner nodes (and the
corresponding newlines), the second 3,000 and the last
10,000.  In the tests, we use the following
stylesheet:

<?xml version="1.0" ?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
>

<xsl:template match="topnode">
  <xsl:call-template name="nodes-only"/>
  <xsl:call-template name="text-only"/>
  <xsl:call-template name="nodes-or-text"/>
</xsl:template>

<xsl:template name="nodes-only">
  <xsl:value-of select="count(*)"/>
  <xsl:value-of select="'&#10;'"/>
</xsl:template>

<xsl:template name="text-only">
  <xsl:value-of select="count(text())"/>
  <xsl:value-of select="'&#10;'"/>
</xsl:template>

<xsl:template name="nodes-or-text">
  <xsl:value-of select="count(*|text())"/>
  <xsl:value-of select="'&#10;'"/>
</xsl:template>

</xsl:stylesheet>
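
For anyone who wants to reproduce the setup, input files of the
shape described above could be generated with a short script.
This is only a sketch: the function names and the output file
names are made up for illustration and are not the files we
actually used.

```python
from pathlib import Path


def build_test_document(inner_count: int) -> str:
    """Return a <topnode> document holding inner_count empty
    <inner-node/> elements, one per line, so that the elements
    are separated by newline text nodes."""
    lines = ['<?xml version="1.0" ?>', "<topnode>"]
    lines.extend("<inner-node/>" for _ in range(inner_count))
    lines.append("</topnode>")
    return "\n".join(lines) + "\n"


def write_test_documents(directory: Path, sizes=(1000, 3000, 10000)) -> list:
    """Write one document per size and return the paths written."""
    paths = []
    for size in sizes:
        # Hypothetical file name; any naming scheme works.
        path = directory / f"nodes-{size}.xml"
        path.write_text(build_test_document(size), encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_test_documents(Path(".")) would write the three
files into the current directory.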

From the profiling using version 1.1.12, we see the
following times:

Template Name      1000 nodes    3000 nodes    10,000 nodes
nodes-or-text        4890          77834         1870993
text-only            1274          12357          213519
nodes-only             27            257            1526

which corresponds to the following slowdown ratios,
relative to the 1000-node run:

Template Name      1000 nodes    3000 nodes    10,000 nodes
nodes-or-text        1           15.9 times     382.6 times
text-only            1            9.7 times     167.6 times
nodes-only           1            9.5 times      56.5 times

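This is definitely not scaling linearly: with linear
scaling, 10 times as many nodes would take about 10
times as long.  A second interesting point is that
count(*|text()) is roughly 4 to 9 times slower,
depending on the input size, than counting * and
counting text() separately.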
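
The arithmetic behind these two observations can be checked
with a small script.  This is a rough sketch: the timings are
copied from the table above, and fitting time ~ n^k from just
two measurements is an assumption about the shape of the curve,
not something the profiler reports.

```python
import math

# Timings copied from the profile table above
# (units as reported by the profiler).
TIMINGS = {
    "nodes-or-text": {1000: 4890, 3000: 77834, 10000: 1870993},
    "text-only": {1000: 1274, 3000: 12357, 10000: 213519},
    "nodes-only": {1000: 27, 3000: 257, 10000: 1526},
}


def growth_exponent(times, small, large):
    """Estimate k in time ~ size**k from two measurements."""
    return math.log(times[large] / times[small]) / math.log(large / small)


for name, times in TIMINGS.items():
    k = growth_exponent(times, 1000, 10000)
    print(f"{name}: time grows roughly as n^{k:.2f}")

for size in (1000, 3000, 10000):
    # Cost of counting * and text() in two separate expressions.
    separate = TIMINGS["text-only"][size] + TIMINGS["nodes-only"][size]
    ratio = TIMINGS["nodes-or-text"][size] / separate
    print(f"{size} nodes: union takes {ratio:.1f}x the two separate counts")
```

Linear scaling would give an exponent of 1.  Between 1,000 and
10,000 nodes the estimates come out at about 2.6 for
nodes-or-text, 2.2 for text-only and 1.8 for nodes-only, and
the union takes about 3.8, 6.2 and 8.7 times as long as the two
separate counts at the three sizes.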

Regards,
Jerome Pesenti


		