Re: [xslt] xsltproc memory consumption w/ large DTD / docbook
- From: Daniel Veillard <veillard redhat com>
- To: Michael Weiser <mweiser fachschaft imn htwk-leipzig de>
- Cc: xslt gnome org
- Subject: Re: [xslt] xsltproc memory consumption w/ large DTD / docbook
- Date: Wed, 29 Jan 2003 06:24:35 -0500
On Tue, Jan 28, 2003 at 04:13:52PM -0500, Daniel Veillard wrote:
> On Tue, Jan 28, 2003 at 09:59:13PM +0100, Michael Weiser wrote:
> > I've also tried narrowing down the source of that enormous memory
> > consumption. I had a look at the stylesheet and the only extraordinary
> > thing it does is loading those small external documents via document().
>
> 1/ document() parsing result *must* be kept in memory to comply
> with generate-id() requirements.
> 2/ when parsing an XML file in XSLT, one *must* load the DTD to
> allow for defaulted attributes and ID/IDREF lookup. This is a
> requirement of the XPath data model and applies to resources
> loaded with document() too.
>
> Knowing that the DocBook DTD is a huge beast and can require
> as much as 3-4 megabytes in the libxml2 DOM tree representation,
> your results are perfectly normal.
> As for the first point, sharing DTD instances doesn't hold up,
> because XML 1.0 allows an internal subset, which makes sharing
> dangerous.
> However I think that in the document() case the DTD parts
> could be removed after the parsing process is finished. That
> could solve your specific problem.
Please try the enclosed patch and report any improvement or crashes.
This is a relatively dangerous change and I want to get feedback before
committing any such change to CVS.
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
Index: libxslt/documents.c
===================================================================
RCS file: /cvs/gnome/libxslt/libxslt/documents.c,v
retrieving revision 1.15
diff -c -r1.15 documents.c
*** libxslt/documents.c 11 Dec 2002 17:53:36 -0000 1.15
--- libxslt/documents.c 29 Jan 2003 11:22:02 -0000
***************
*** 162,167 ****
--- 162,168 ----
  xsltLoadDocument(xsltTransformContextPtr ctxt, const xmlChar *URI) {
      xsltDocumentPtr ret;
      xmlDocPtr doc;
+     xmlDtdPtr dtd;

      if ((ctxt == NULL) || (URI == NULL))
          return(NULL);
***************
*** 210,215 ****
--- 211,225 ----
       */
      if (xsltNeedElemSpaceHandling(ctxt))
          xsltApplyStripSpaces(ctxt, xmlDocGetRootElement(doc));
+
+     /*
+      * Remove the DTD from the document, it's not needed anymore.
+      */
+     dtd = xmlGetIntSubset(doc);
+     if (dtd != NULL) {
+         xmlUnlinkNode((xmlNodePtr) dtd);
+         xmlFreeDtd(dtd);
+     }

      ret = xsltNewDocument(ctxt, doc);
      return(ret);