[xslt] Performance issue when matching text()|*
- From: Jerome Pesenti <jpesenti yahoo com>
- To: xslt gnome org
- Subject: [xslt] Performance issue when matching text()|*
- Date: Fri, 28 Jan 2005 08:12:00 -0800 (PST)
We encountered a strange performance problem related
to the cost of building large nodesets. In
particular, the cost of building a nodeset does not
appear to be linear in its size. To demonstrate
this, we created three XML files of the following form:
<?xml version="1.0" ?>
<topnode>
<inner-node/>
<inner-node/>
...
</topnode>
The first had 1,000 inner nodes (and the
corresponding newlines), the second 3,000, and the last
10,000. In the tests, we used the following
stylesheet:
<?xml version="1.0" ?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
>
  <xsl:template match="topnode">
    <xsl:call-template name="nodes-only"/>
    <xsl:call-template name="text-only"/>
    <xsl:call-template name="nodes-or-text"/>
  </xsl:template>
  <xsl:template name="nodes-only">
    <xsl:value-of select="count(*)"/>
    <xsl:value-of select="' '"/>
  </xsl:template>
  <xsl:template name="text-only">
    <xsl:value-of select="count(text())"/>
    <xsl:value-of select="' '"/>
  </xsl:template>
  <xsl:template name="nodes-or-text">
    <xsl:value-of select="count(*|text())"/>
    <xsl:value-of select="' '"/>
  </xsl:template>
</xsl:stylesheet>
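For anyone who wants to reproduce the setup, the three input files
described above could be generated with a short script along these
lines. This is only a sketch and not necessarily how the original
files were produced: the function names and the `nodes-N.xml` file
names are assumptions made for illustration.

```python
from pathlib import Path


def build_test_document(inner_count: int) -> str:
    """Return a <topnode> document with `inner_count` empty children.

    Each <inner-node/> sits on its own line, so the children of
    <topnode> alternate between whitespace-only text nodes and
    elements (inner_count elements, inner_count + 1 text nodes).
    """
    lines = ['<?xml version="1.0" ?>', "<topnode>"]
    lines.extend("<inner-node/>" for _ in range(inner_count))
    lines.append("</topnode>")
    return "\n".join(lines) + "\n"


def write_test_files(directory: str, sizes=(1000, 3000, 10000)) -> list:
    """Write one file per size and return the paths that were written."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for size in sizes:
        path = out_dir / f"nodes-{size}.xml"  # hypothetical file name
        path.write_text(build_test_document(size), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    for written in write_test_files("."):
        print(written)
```

Keeping one element per line matters here, because the newline text
nodes are what `count(text())` and `count(*|text())` pick up.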
From the profiling using version 1.1.12, we see the
following times:
Template Name    1000 nodes   3000 nodes   10,000 nodes
nodes-or-text          4890        77834        1870993
text-only              1274        12357         213519
nodes-only               27          257           1526
which corresponds to the following speed ratios:
Template Name    1000 nodes   3000 nodes    10,000 nodes
nodes-or-text             1   15.9 times     382.6 times
text-only                 1    9.7 times     167.6 times
nodes-only                1    9.5 times      56.5 times
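The ratios follow directly from the raw timings. The sketch below
recomputes them and also estimates an apparent growth exponent k,
assuming the time grows roughly as size^k between the smallest and
the largest file. The function names and the power-law assumption are
illustrative only and were not part of the original profiling.

```python
import math

SIZES = (1000, 3000, 10000)

# Raw profiler times from the table, in the profiler's own units.
TIMINGS = {
    "nodes-or-text": (4890, 77834, 1870993),
    "text-only": (1274, 12357, 213519),
    "nodes-only": (27, 257, 1526),
}


def speed_ratios(times):
    """Return each time divided by the time for the smallest file."""
    base = times[0]
    return [t / base for t in times]


def scaling_exponent(sizes, times):
    """Estimate k in time ~ size**k from the first and last measurement."""
    return math.log(times[-1] / times[0]) / math.log(sizes[-1] / sizes[0])


if __name__ == "__main__":
    for name, times in TIMINGS.items():
        ratios = ", ".join(f"{r:.1f}" for r in speed_ratios(times))
        exponent = scaling_exponent(SIZES, times)
        print(f"{name}: ratios {ratios}; exponent {exponent:.2f}")
```

An exponent near 1 would indicate linear scaling. For these timings
all three exponents come out well above 1, and above 2 for the two
templates that involve text().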
This is definitely not scaling linearly. A second
interesting point is that count(*|text()) is roughly 5
times slower than counting * and then counting text()
separately.
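To make the second point concrete, the union timing can be compared
against the sum of the two separate counts at each file size. This is
a minimal sketch using the figures from the timing table; the names
are assumptions for illustration.

```python
SIZES = (1000, 3000, 10000)

# Profiler times per file size, taken from the timing table.
UNION = (4890, 77834, 1870993)       # count(*|text())
TEXT_ONLY = (1274, 12357, 213519)    # count(text())
ELEMENTS_ONLY = (27, 257, 1526)      # count(*)


def union_overhead(union, elements, text):
    """Return union time divided by (elements + text) time, per size."""
    return [u / (e + t) for u, e, t in zip(union, elements, text)]


if __name__ == "__main__":
    factors = union_overhead(UNION, ELEMENTS_ONLY, TEXT_ONLY)
    for size, factor in zip(SIZES, factors):
        print(f"{size} nodes: union is {factor:.1f}x the separate counts")
```

For these timings the factor is not constant: it grows from just
under 4 at 1000 nodes to nearly 9 at 10,000 nodes.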
Regards,
Jerome Pesenti