Re: [xslt] xsltproc memory consumption w/ large DTD / docbook
- From: cbozeman hiwaay net
- To: xslt gnome org
- Subject: Re: [xslt] xsltproc memory consumption w/ large DTD / docbook
- Date: Tue, 28 Jan 2003 15:07:21 -0600 (CST)
Try the --novalid option of xsltproc; according to the man page, this
option should prevent the DTD from being loaded.
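
A rough way to check whether the DTD really is what costs the memory is to run the same transformation once without and once with --novalid and compare the peak process size. The sketch below is not from the original thread. It is a small Python driver, and the helper names build_command and peak_rss are made up for illustration. It assumes a Unix system (os.wait4), Python 3.9 or later, and an xsltproc on the PATH. The stylesheet and source file are whatever you pass on the command line.

```python
import os
import subprocess
import sys


def build_command(stylesheet, source, novalid=True):
    """Return the xsltproc argument list, optionally with --novalid."""
    cmd = ["xsltproc"]
    if novalid:
        cmd.append("--novalid")  # skip loading the document's DTD
    cmd += [stylesheet, source]
    return cmd


def peak_rss(cmd):
    """Run cmd and return (exit_code, ru_maxrss) for that child alone."""
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    # os.wait4 reports resource usage for this one child only.
    _, status, usage = os.wait4(proc.pid, 0)
    proc.returncode = os.waitstatus_to_exitcode(status)
    # ru_maxrss is in kilobytes on Linux (bytes on macOS).
    return proc.returncode, usage.ru_maxrss


# Only runs when exactly two arguments are given: stylesheet and source.
if __name__ == "__main__" and len(sys.argv) == 3:
    stylesheet, source = sys.argv[1:]
    for novalid in (False, True):
        cmd = build_command(stylesheet, source, novalid)
        code, rss = peak_rss(cmd)
        print(" ".join(cmd), "-> exit", code, "peak RSS", rss)
```

Saved under a name of your choice, it would be called with the stylesheet and one source file as its two arguments. Keep in mind that with the DTD skipped, entities and default attribute values declared in the DTD are not available to the documents, so this only helps if the source files do not depend on them.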
Quoting Michael Weiser <email@example.com>:
> Hi all,
> I've got a usage problem with libxml2/libxslt/xsltproc. I want to use
> Norman Walsh's Website DTD, which in turn is based on his
> DocBook-XSL-Stylesheets and the DocBook-XML-DTD. So far xsltproc works
> just great for that task in combinations of current CVS of libxml2 and
> libxslt as well as 2.4.16/1.0.22 on Linux 2.4.20 with glibc 2.1.3 (RH
> Linux 6.2) as well as 2.2.93 (RH 8.0) with latest patches installed.
> The problem is that xsltproc consumes huge amounts of memory,
> mainly on storing the DTDs. I know that such a huge XSL/DTD framework as
> DocBook is a really bad testcase and I'm prepared to dig into it.
> But before I do so I'd just like to know whether that's perhaps
> normal for xsltproc.
> As for my setup, I've got about 300 source XML files of doctype Website,
> about 800 bytes each. They get loaded via document()-xpath-selects by a
> self-contained XSL stylesheet (autolayout.xsl). After about 150 of them the
> process size of xsltproc reaches 900MB, swap runs out and the Linux kernel
> kills the process.
> So far I've tried making sure that it isn't a memory leak, using dmalloc with
> both of my libxml2/libxslt installations. While the release versions are
> completely clean, the CVS snapshots forget to free a few bytes overall
> (less than 100 in 5 blocks or so), which doesn't seem that critical to me.
> I've also tried narrowing down the source of that enormous memory
> consumption. I had a look at the stylesheet and the only extraordinary
> thing it does is loading those small external documents via document().
> Otherwise it's completely self-contained, meaning it isn't using
> the DocBook-XSL-Stylesheets at all because they haven't come into play
> at that stage of processing. It can't be the document size either because
> they're only about 800 bytes each.
> But I thought that it might be the DTD size because Website is based on
> DocBook and that DTD is quite large. Therefore I tried processing just one
> document and dmalloc reported an overall memory consumption of 33MB.
> After adding a second one of those 800 byte documents it went up to something
> like 66MB.
> Therefore a wild guess: Does xsltproc load, parse and *store* the DTD for
> every document it includes via document()? If so: Is there a way to make
> it stop doing so or reuse the already loaded DTDs in memory?
> BTW: For now I do that particular processing stage using saxon-7.3.1, which
> does it consuming about 50MB of memory. xsltproc then transforms the
> individual pages (800 bytes XML) consuming about 30-40MB of RAM in the
> process, which seems to be the DTD as well as the DocBook-XSL stylesheets.
> Thanks in advance for your help.
> xslt mailing list, project page http://xmlsoft.org/XSLT/