[xslt] Re: '[xslt] "document() function memory usage"'


on 4/2/2004 6:28 AM William M. Brack wrote:

> Since no one else is responding to your question, I'll try.  Take my
> comments "with a grain of salt", since most of what I say is from
> reading the library sources, and I have been known to get badly
> confused ;-).
> When you use the document() function, the library has no way of
> knowing whether you will use it again, or whether this is a
> transient access.  Any time you work with a document, it has to be
> parsed, etc., which is relatively time-consuming.  In order to avoid
> the overhead of re-reading and re-parsing a document multiple times,
> the library maintains a "document list" as a part of the
> "transformation context".  The parsed image of any document which is
> accessed is added to this list, and the list (together with the
> documents it references) is only freed when the transformation
> context is freed.
> Using xsltproc, I can't think of any easy way around your problem.
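
The behaviour described above can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions, not libxslt code: the class `TransformContext`, its `doc_list` attribute, and the `demo()` helper are hypothetical names chosen for illustration, and the 520 in-memory "files" stand in for the test suite members mentioned in the thread. It shows why every document fetched through document() stays in memory until the context itself is released.

```python
import xml.etree.ElementTree as ET


class TransformContext:
    """Toy stand-in for a transformation context (hypothetical, not the libxslt API)."""

    def __init__(self, loader):
        self._loader = loader   # callable: URI -> XML text
        self.doc_list = {}      # URI -> parsed tree, kept until close()
        self.parse_count = 0    # how many times a document was actually parsed

    def document(self, uri):
        # Parse only on first access; later accesses reuse the cached tree.
        if uri not in self.doc_list:
            self.doc_list[uri] = ET.fromstring(self._loader(uri))
            self.parse_count += 1
        return self.doc_list[uri]

    def close(self):
        # The cached documents are released only here.
        self.doc_list.clear()


def demo():
    files = {f"test{i}.xml": f"<test id='{i}'/>" for i in range(520)}
    ctx = TransformContext(files.__getitem__)
    for uri in files:
        ctx.document(uri)
    ctx.document("test0.xml")   # served from the list, no re-parse
    held, parses = len(ctx.doc_list), ctx.parse_count
    ctx.close()
    return held, parses, len(ctx.doc_list)
```

In this model all 520 parsed trees are held at once, which is the accumulation pattern reported in the original question; the benefit is that the repeated access to `test0.xml` costs no second parse.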

Thank you very much for the information!

Since I don't know the libxslt sources well enough, I can only speculate
whether it would be possible to determine that a document is no longer
needed. In the case of <xsl:apply-templates select="document(@href)" />, I
tend to think of a fingerprint being assigned to the "apply-templates"
process and to the documents used in it, so that once the process has
finished, all sub-documents could be released. But wait, I see: the
problem is not the technique of when to free the document but
*performance*, i.e. the fact that the processor has "no way of knowing
whether you will use it again" (as you wrote). Hmm, it seems to me that
the document() function lacks a way to state whether the document should
be re-read on every access or kept persistent. Could you imagine an
option for the XSLT processor that defines whether anonymous document
references (like the one in the apply-templates example) should be
handled as non-persistent? Would this be possible?
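
To make the proposed option concrete, here is a minimal Python sketch of the trade-off. It is purely illustrative: `PolicyContext`, the `cache_anonymous` flag, and `run()` are hypothetical names, and nothing here corresponds to an existing libxslt or xsltproc option. With the flag off, a document loaded for a single apply-templates call is dropped as soon as the call returns.

```python
import xml.etree.ElementTree as ET


class PolicyContext:
    """Toy context with a hypothetical switch for anonymous document() references."""

    def __init__(self, loader, cache_anonymous=True):
        self._loader = loader                 # callable: URI -> XML text
        self.cache_anonymous = cache_anonymous
        self.doc_list = {}                    # URI -> parsed tree
        self.parse_count = 0

    def apply_templates(self, uri, template):
        doc = self.doc_list.get(uri)
        if doc is None:
            doc = ET.fromstring(self._loader(uri))
            self.parse_count += 1
            if self.cache_anonymous:
                # Persistent mode: keep the tree for possible reuse.
                self.doc_list[uri] = doc
        # In non-persistent mode the tree becomes garbage after this call.
        return template(doc)


def run(cache_anonymous):
    files = {f"test{i}.xml": f"<test id='{i}'/>" for i in range(520)}
    ctx = PolicyContext(files.__getitem__, cache_anonymous)
    for uri in files:
        ctx.apply_templates(uri, lambda d: d.get("id"))
    # A second access to the same document shows the cost of not caching.
    ctx.apply_templates("test0.xml", lambda d: d.get("id"))
    return len(ctx.doc_list), ctx.parse_count
```

In this sketch the non-persistent mode holds no documents after the loop but has to parse `test0.xml` a second time, while the persistent mode keeps all 520 trees and parses each exactly once. That is the memory-versus-performance choice discussed above.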



> Regards,
> Bill
> Kasimier Buchcik said:
>>on 3/24/2004 5:56 PM Kasimier Buchcik wrote:
>>>I'm trying to transform the W3C DOM-Test-Suite to Delphi. Doing this, I
>>>have a stylesheet that loads about 520 XML files (2-4 KB) in a loop and
>>>processes them. Since the transformation eats all the memory on my
>>>machine (256 MB real memory, 630 MB virtual memory), the question is
>>>whether the documents loaded by the following construct will be
>>>released only at the end of the transformation:
>>><xsl:for-each select="*[local-name() = 'suite.member']">
>>>   <xsl:apply-templates select="document(@href)" />
>>></xsl:for-each>
>>>(Note: the output method "text" is used; the transformation runs
>>>smoothly if only a few tests are processed.)
>>>If they are released only at the end of the transformation, would there
>>>be any way of implementing destruction of documents that are no longer
>>>used once xsl:apply-templates has been processed?
>>>Do any other explanations come to mind for why this transformation
>>>uses so much memory?
>>Since there was no feedback on the issue, I assume some information
>>was missing, so I'll try once more.
>>The question is: is a document imported with the document() function
>>freed after the application of <xsl:apply-templates> in the example
>>above, or is it freed when the whole transformation has ended?
>>I'm asking this because I tried to process many sub-documents with
>>document() and my machine ran out of memory. IOW, it feels like the
>>sub-documents are accumulated in memory.
>>I'm running on a w2k machine and used xsltproc for the test:
>>L:\Open_idom\XPtest\w3c>xsltproc --version
>>Using libxml 20607, libxslt 10104 and libexslt 804
>>xsltproc was compiled against libxml 20607, libxslt 10104 and
>>libexslt 804
>>libxslt 10104 was compiled against libxml 20607
>>libexslt 804 was compiled against libxml 20607
> _______________________________________________
> xslt mailing list, project page http://xmlsoft.org/XSLT/
> xslt@gnome.org
> http://mail.gnome.org/mailman/listinfo/xslt
