Re: [xslt] xsltproc memory consumption w/ large DTD / docbook
- From: cbozeman hiwaay net
- To: xslt gnome org
- Subject: Re: [xslt] xsltproc memory consumption w/ large DTD / docbook
- Date: Tue, 28 Jan 2003 15:07:21 -0600 (CST)
Try running xsltproc with the --novalid option; according to the man
page, it should skip loading the DTD.
Charlie B.
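
As a rough illustration of the suggestion above, here is a minimal Python
sketch that assembles such an xsltproc command line. The helper name
build_xsltproc_cmd and the file names layout.xml and autolayout.xml are
assumptions made for this example; only autolayout.xsl is taken from the
quoted message. The --novalid and -o options are documented in the
xsltproc man page.

```python
import shlex


def build_xsltproc_cmd(stylesheet, source, output=None, novalid=True):
    """Return an argv list for xsltproc (hypothetical helper).

    With novalid=True the --novalid option is added, which the man page
    describes as skipping the DTD loading phase.
    """
    cmd = ["xsltproc"]
    if novalid:
        cmd.append("--novalid")
    if output is not None:
        # -o names the output file instead of writing to stdout
        cmd += ["-o", output]
    # xsltproc expects: [options] stylesheet file
    cmd += [stylesheet, source]
    return cmd


if __name__ == "__main__":
    # layout.xml and autolayout.xml are placeholder file names
    cmd = build_xsltproc_cmd("autolayout.xsl", "layout.xml",
                             output="autolayout.xml")
    # Print a shell-ready command line rather than running it here
    print(" ".join(shlex.quote(part) for part in cmd))
```

One caveat, as far as I understand it: if the DTD is not loaded, entities
and default attribute values that are declared only in the DTD are not
available to the parser, so documents relying on them may need adjusting.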
Quoting Michael Weiser <mweiser@fachschaft.imn.htwk-leipzig.de>:
> Hi all,
>
> I've got a usage problem with libxml2/libxslt/xsltproc. I want to use
> Norman Walsh's Website DTD, which in turn is based on his
> DocBook-XSL-Stylesheets and the DocBook-XML-DTD. So far xsltproc has
> worked just great for that task in combinations of current CVS of
> libxml2 and libxslt as well as 2.4.16/1.0.22 on Linux 2.4.20 with
> glibc 2.1.3 (RedHat Linux 6.2) as well as 2.2.93 (RH 8.0) with latest
> patches installed.
>
> The problem is that xsltproc consumes huge amounts of memory,
> seemingly mainly on storing the DTDs. I know that such a huge XSL/DTD
> framework as DocBook is a really bad testcase and I'm prepared to dig
> into it further. But before I do so I'd just like to know whether
> that's perhaps considered normal for xsltproc.
>
> As for my setup, I've got about 300 source XML files of doctype
> Website, about 800 bytes each. They get loaded via document() XPath
> selects by a self-contained XSL stylesheet (autolayout.xsl). After
> about 150 of them, the process size of xsltproc reaches 900MB, swap
> runs out and the Linux OOM killer kills the process.
>
> So far I've tried making sure that it isn't a memory leak, using
> dmalloc on both of my libxml2/libxslt installations. While the release
> versions are completely clean, the CVS snapshots forget to free a few
> bytes overall (less than 100 in 5 blocks or so), which doesn't seem
> that critical to me.
>
> I've also tried narrowing down the source of that enormous memory
> consumption. I had a look at the stylesheet and the only extraordinary
> thing it does is loading those small external documents via
> document(). Otherwise it's completely self-contained, meaning it isn't
> including/using the DocBook-XSL-Stylesheets at all because they
> haven't come into play yet at that stage of processing. It can't be
> the document size either because they're only about 800 bytes each.
>
> But I thought that it might be the DTD size because Website is based
> on DocBook and that DTD is quite large. Therefore I tried processing
> just one document and dmalloc reported an overall memory consumption
> of 33MB. After adding a second one of those 800 byte documents it went
> up to about 66MB.
>
> Therefore a wild guess: Does xsltproc load, parse and *store* the DTD
> of every document it includes via document()? If so: Is there a way to
> make it stop doing so or reuse the already loaded DTDs in memory?
>
> BTW: For now I do that particular processing stage using saxon-7.3.1
> which does it consuming about 50MB of memory. xsltproc then transforms
> the individual pages (800 bytes XML) consuming about 30-40MB of RAM in
> the process which seems to be the DTD as well as the DocBook-XSL
> stylesheets.
>
> Thanks in advance for your help.
> --
> Micha