Re: [xslt] xsltproc memory consumption w/ large DTD / docbook



On Wed, 29 Jan 2003, Daniel Veillard wrote:

> > Daniel: Can you give me a hint where to best start looking for freeing
> > unneeded memory other than xsltLoadDocument?
> superfluous DTD informations. get a debugger, put a breakpoint at the
> place of my patch and see what's happening, I can't work on it
> at the moment.
Sorry for bothering you so much. Attached is a patch that does the trick
for me. Memory consumption now starts at 9MB and slowly grows to about
15MB while processing the same autolayout.xsl that previously got
OOM-killed at 900MB. I'm re-transforming my whole website right now and
it seems to work just fine for the other documents as well.

I think the actual problem was that xmlGetIntSubset returns only the
internal subset, not the external one, so the external subset was never
freed. Therefore I just grabbed the DTD-freeing code from xmlFreeDoc and
put it into xsltLoadDocument. I'm sorry I couldn't integrate it more
cleanly, but I know too little about libxml/libxslt internals to find the
right place and way to do it.
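
For readers without the libxml sources at hand, here is a minimal
stand-alone sketch of the freeing logic described above. It does not use
libxml at all: MockDtd, MockDoc, mock_free_dtd and mock_drop_dtds are
hypothetical stand-ins invented for illustration. It only shows why the
two subset pointers are compared before freeing: if both fields point at
the same DTD, it must be released once, not twice. As an assumption of
this sketch, both fields are cleared in the shared case so that a later
cleanup pass cannot see a stale pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for libxml's DTD and document structures;
 * only the two subset pointers matter for this sketch. */
typedef struct MockDtd {
    int placeholder;
} MockDtd;

typedef struct MockDoc {
    MockDtd *intSubset;
    MockDtd *extSubset;
} MockDoc;

/* Counts releases so a caller can check that nothing is freed twice. */
int mock_dtd_frees = 0;

void mock_free_dtd(MockDtd *dtd) {
    free(dtd);
    mock_dtd_frees++;
}

/* Drop both DTD subsets of a document, freeing a shared DTD only once.
 * Returns the number of DTDs released by this call. */
int mock_drop_dtds(MockDoc *doc) {
    MockDtd *extSubset, *intSubset;
    int released = 0;

    if (doc == NULL)
        return 0;

    extSubset = doc->extSubset;
    intSubset = doc->intSubset;

    /* Same object behind both fields: let the intSubset branch free it
     * and clear the other field so no stale pointer is left behind. */
    if (intSubset == extSubset) {
        extSubset = NULL;
        doc->extSubset = NULL;
    }
    if (extSubset != NULL) {
        doc->extSubset = NULL;
        mock_free_dtd(extSubset);
        released++;
    }
    if (intSubset != NULL) {
        doc->intSubset = NULL;
        mock_free_dtd(intSubset);
        released++;
    }
    return released;
}
```

Calling mock_drop_dtds on a document whose two fields share one DTD
releases exactly one object; with two distinct DTDs it releases two. In
both cases the document ends up with both fields set to NULL.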

Thanks for your help and keep up the great work.
-- 
bye, Micha
--- libxslt-unpatched/libxslt/documents.c	Wed Jan 29 19:11:07 2003
+++ libxslt/libxslt/documents.c	Thu Jan 30 10:43:36 2003
@@ -162,6 +162,7 @@
 xsltLoadDocument(xsltTransformContextPtr ctxt, const xmlChar *URI) {
     xsltDocumentPtr ret;
     xmlDocPtr doc;
+    xmlDtdPtr extSubset, intSubset;
 
     if ((ctxt == NULL) || (URI == NULL))
 	return(NULL);
@@ -210,6 +211,21 @@
      */
     if (xsltNeedElemSpaceHandling(ctxt))
 	xsltApplyStripSpaces(ctxt, xmlDocGetRootElement(doc));
+
+    extSubset = doc->extSubset;
+    intSubset = doc->intSubset;
+    if (intSubset == extSubset)
+	extSubset = doc->extSubset = NULL;
+    if (extSubset != NULL) {
+	xmlUnlinkNode((xmlNodePtr) doc->extSubset);
+	doc->extSubset = NULL;
+	xmlFreeDtd(extSubset);
+    }
+    if (intSubset != NULL) {
+	xmlUnlinkNode((xmlNodePtr) doc->intSubset);
+	doc->intSubset = NULL;
+	xmlFreeDtd(intSubset);
+    }
 
     ret = xsltNewDocument(ctxt, doc);
     return(ret);

