Re: [xslt] Probable memory leak (when using document()?)



On Fri, Jan 28, 2005 at 05:56:55PM +0100, Vincent Lefevre wrote:
> I reported a bug on the Debian BTS:
> 
>   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=287371
> 
> As I said:
> 
> Here xsltproc takes up to 138 MB, making the whole system slow down
> due to swapping. This problem occurs when generating my blog page,
> where a document() is used for each blog item (this will change in
> the future, but the current behavior shouldn't occur). The sources
> are in a DocBook-based DTD that can be downloaded from
> 
>   http://www.vinc17.org/DTD/website.dtd
> 
> I'm not including the XML sources since this is quite complicated
> (lots of inclusions and dependencies). But if the bug is not known,
> I could try to build a simpler example.
> 
> ----
> 
> I tried with libxml2 2.6.16 and libxslt 1.1.8.
> 
> Should I report the bug to the GNOME BTS?
> Is it a bug in libxml2 or libxslt?

  This doesn't look like a bug to me. You're calling document() many times.
The document trees must be kept in memory for the whole transformation to
ensure that the semantics of document() can be maintained
   http://www.w3.org/TR/xslt#document
i.e. that
   generate-id(document("foo.xml"))=generate-id(document("foo.xml"))

  That equality holds only while the trees stay in memory, since generate-id()
in libxslt is based on node pointer values. Moreover, each document references
the DocBook DTD, which is far from being small. So it's not surprising at all
to me if memory grows very fast. The only doubt I have is whether the DTD of
those documents could be removed once parsed, since all entity values should
have been replaced at that point, but I'm not 100% sure it's a safe thing to do.
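
  To illustrate the constraint above, here is a minimal sketch in Python. It
is not libxslt code: DocumentCache, its in-memory "sources" dictionary, and
this generate_id() are hypothetical stand-ins, assuming only that ids are
derived from object identity the way libxslt derives them from node pointers.

```python
import xml.etree.ElementTree as ET


class DocumentCache:
    """Hypothetical per-transformation cache: one parsed tree per URI."""

    def __init__(self, sources):
        # URI -> XML text; an in-memory stand-in for files on disk.
        self._sources = sources
        self._trees = {}

    def document(self, uri):
        # Parse on first use, then keep the tree alive and return the
        # same object on every later call for the same URI.
        if uri not in self._trees:
            self._trees[uri] = ET.fromstring(self._sources[uri])
        return self._trees[uri]

    def loaded_count(self):
        return len(self._trees)


def generate_id(node):
    # Identity-based id, analogous to an id built from a node pointer.
    return "id%x" % id(node)


cache = DocumentCache({"foo.xml": "<item>hello</item>"})
first = generate_id(cache.document("foo.xml"))
second = generate_id(cache.document("foo.xml"))
same = first == second
```

  Because the tree is never dropped, both calls see the same object and the
ids match. If the tree were freed between the two calls, nothing would
guarantee that the reparsed tree gets the same identity, which is why the
memory cannot simply be released after each document() call.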
  Using a DTD which takes 2.5 MByte of memory for each blog item, when the
items themselves should be around a kilobyte each, sounds like a very heavy
design to me. You're paying the cost of that design, I would say.
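
  As a rough sanity check of those numbers, the sketch below divides the
reported peak by the per-document DTD cost. It assumes, purely for
illustration, that the DTD copies account for all of the memory and ignores
the stylesheet and the main document; docs_to_reach() is a hypothetical
helper, not part of any tool.

```python
PEAK_MB = 138.0     # peak xsltproc memory quoted in the bug report
PER_DOC_MB = 2.5    # approximate DTD cost per loaded document, from above


def docs_to_reach(peak_mb, per_doc_mb):
    # Whole number of per-document DTD copies that fit in the peak figure.
    return int(peak_mb // per_doc_mb)


estimated_docs = docs_to_reach(PEAK_MB, PER_DOC_MB)
print(estimated_docs)
```

  Under that assumption a few dozen blog items are enough to reach the
reported figure, so the growth is consistent with one retained DTD per
document() call rather than with a leak.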
  It doesn't sound like a libxslt bug to me.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

