AW: AW: [xml] extremely long xslt transformation times



  2 sounds technically impossible; I don't want to change the format
of the in-memory representation. 1 and 3 are okay at an architectural
level, but I don't see how to do this easily in practice. But if you
know how to do this, feel free to work on it, I take patches!


Okay, so if the UNION operator sorts the result set, that top-level
sort is not needed, right?

  no, precisely because the UNION does not sort.

What about COLLECTS, do they deliver sorted results as well? If so, maybe we
can optimize more?

  They do only if you have a single initial node. If you have
a set of nodes as input, you get an aggregation of sorted sublists.
And the sort order always depends on the axis: some are forward,
others backward (and some return at most a single value).
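
  To illustrate the "aggregation of sorted sublists" point, here is a
small sketch in Python (not libxml2 code). It assumes each node is modelled
as an integer giving its document-order position, and merge_axis_results
is a hypothetical helper name. Since every sublist is already ordered, a
k-way merge may be enough instead of a full sort:

```python
import heapq


def merge_axis_results(sublists, reverse_axis=False):
    """Merge per-context-node results into one list in document order.

    Each node is modelled as an integer giving its document-order
    position (an assumption made for this sketch only).
    """
    if reverse_axis:
        # Reverse axes (ancestor, preceding, ...) deliver each sublist
        # in reverse document order, so flip them before merging.
        sublists = [list(reversed(sub)) for sub in sublists]
    merged = []
    for pos in heapq.merge(*sublists):
        # Overlapping sublists can contain the same node twice; keep one.
        if not merged or merged[-1] != pos:
            merged.append(pos)
    return merged


# Forward axis from two context nodes, in the order the nodes arrived:
# each sublist is sorted, the plain concatenation is not.
forward = [[7, 8, 9], [2, 3, 4]]
aggregated = [pos for sub in forward for pos in sub]

# Reverse axis (e.g. ancestor) from two context nodes: each sublist runs
# backward and both share the root (position 0).
backward = [[5, 1, 0], [3, 2, 0]]

forward_merged = merge_axis_results(forward)
backward_merged = merge_axis_results(backward, reverse_axis=True)
```

  This only pays off if comparing two positions is cheap, which is the
expensive part with plain node pointers.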

  If it were simple I would have done that optimization. Some optimizations
are certainly possible, but I didn't try to work in the sorting area.


After spending about two days investigating the existing source code
and reading some articles about other implementations, I agree with you
that this is not a simple matter and that nothing can be implemented quickly.

The only thing I could imagine for the short term to improve sort
performance would be to give each node in the source XML a unique
sequence number, which represents the document order. This number
would only be computed for the XSLT usage of the DOM tree and should
be kept in parallel to the xmlNode structures, maybe hanging off the
document. I could imagine a hash table that maps xmlNodePtr to the
sequence number, even though this table could become quite large.
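
To make the idea concrete, here is a minimal sketch in Python rather than C,
using the standard xml.etree.ElementTree module as a stand-in for the
xmlNode tree. The names number_nodes and sort_document_order are made up
for this example, and only element nodes are numbered (text and attribute
nodes would need numbers too in a real implementation):

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Map every element to its document-order sequence number.

    The dict lives beside the tree and plays the role of the hash
    table from node pointer to sequence number.
    """
    # root.iter() walks the tree depth-first, i.e. in document order.
    return {node: seq for seq, node in enumerate(root.iter())}


def sort_document_order(nodes, order):
    """Sort a node set by looking up the precomputed sequence numbers."""
    return sorted(nodes, key=order.__getitem__)


root = ET.fromstring("<doc><a><b/><c/></a><d/></doc>")
order = number_nodes(root)

# A node set that arrives out of document order.
unsorted_nodes = [root.find("d"), root.find("a/b"), root.find("a")]
result = [node.tag for node in sort_document_order(unsorted_nodes, order)]
```

With the table in place, comparing two nodes is a pair of lookups instead
of a walk through the tree. The price is one table entry per node, and the
numbers go stale as soon as the tree is modified.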

Could you think of other approaches to improve the sorting performance?

As for optimizing the compiled XPath and optimizing the sort away, I have
given up for the moment, but I want to come back and revisit it. With
regard to sorting: is the sort order defined when the
document() function is used? Example:

<DOCLIST>
        <DOC>a.xml</DOC>
        <DOC>b.xml</DOC>
        <DOC>c.xml</DOC>
</DOCLIST>

<xsl:stylesheet ...>
        <xsl:template match="/">
                <xsl:for-each select="document(/DOCLIST/DOC)/*">
                        ... what is the document order here???
                </xsl:for-each>
        </xsl:template>
</xsl:stylesheet>

Thanks
Thomas


