Re: [xml] libxml2 memory consumption

From: Daniel Veillard <veillard redhat com>
To: "Henke, Markus" <Markus_Henke ordat com>
Cc: "'Petr Tomasek'" <tomasek etf cuni cz>, "'xml gnome org'" <xml gnome org>
Subject: Re: [xml] libxml2 memory consumption
Date: Fri, 19 Apr 2002 09:33:41 -0400

On Fri, Apr 19, 2002 at 03:12:12PM +0200, Henke, Markus wrote:

-----Original Message-----
From: Petr Tomasek [mailto:tomasek etf cuni cz]
Sent: Friday, April 19, 2002 2:59 PM
To: Henke, Markus
Subject: Re: [xml] libxml2 memory consumption


On Thu, Apr 18, 2002 at 01:27:31PM +0200, Henke, Markus wrote:

Is this the "normal" relation of document size/
memory consumption or is something wrong with


Yes it is. You need several variables to be stored for each node.


Well, that's clear. But a ratio of 1:12?


  Depends on the ratio of markup vs. data in your XML.

Doesn't that mean that parsing a document with the DOM-like API
is impractical for documents larger than ~20MB (on an "average"
machine)?
I had hoped there was a way to reduce memory consumption...


  Use the SAX API. DOM uses a lot of memory, it's a known fact.
Or discard the parts of the tree you don't need as you build them.
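
To illustrate the two options above, here is a minimal sketch. It uses
Python's standard library (xml.sax and xml.etree.ElementTree) as a
stand-in, not libxml2's own C interface, and the sample document, the
<item> element name and the helper function names are all hypothetical.
The first function handles elements as SAX events and never builds a
tree; the second builds the tree incrementally and discards each record
once it has been handled.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document: many small <item> records under one root.
SAMPLE = b"<root>" + b"".join(
    b"<item id='%d'>value %d</item>" % (i, i) for i in range(1000)
) + b"</root>"


class ItemCounter(xml.sax.ContentHandler):
    """SAX handler: sees each element as an event and keeps no tree."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_with_sax(data):
    """Count <item> elements using SAX callbacks only."""
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_with_pruning(data):
    """Build the tree incrementally, dropping each record once handled."""
    count = 0
    context = ET.iterparse(io.BytesIO(data), events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == "item":
            count += 1
            root.clear()  # discard the children collected so far
    return count
```

Both functions return the same count for the sample. With the pruning
variant the tree should hold at most one <item> subtree at a time, so
memory use need not grow with the size of the document.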

BTW, Daniel, if I understand correctly, each string (e.g. an element
name) is stored every time it occurs. (I mean, say you have an XML
document with 10000 occurrences of a <something/> element, so you have
the string "something" in memory 10000 times.) Maybe we could use hash
tables while parsing the document and keep identical strings in one
location?
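
The hash-table idea proposed above can be sketched roughly as follows.
This is only an illustration in Python, not how libxml2 stores names;
the StringPool class and its method names are hypothetical.

```python
class StringPool:
    """Hypothetical intern table: one stored copy per distinct string."""

    def __init__(self):
        self._pool = {}

    def intern(self, s):
        # Return the copy already stored, or store this one if it is new.
        return self._pool.setdefault(s, s)

    def __len__(self):
        return len(self._pool)


pool = StringPool()

# Build the names at run time, the way a parser would produce them
# one occurrence at a time.
names = ["".join(["some", "thing"]) for _ in range(10000)]

# After interning, every occurrence refers to the same stored object.
interned = [pool.intern(n) for n in names]
```

The pool holds one entry no matter how often the name occurs, at the
cost of a hash lookup for every string the parser sees.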


  that's 10000 x (9 bytes + your libc allocator data)
  i.e. 90 KBytes + ???

And it comes at a serious price: it would also break the ABI/API, make
the code harder to understand, and make it more expensive to run.
I did some initial testing and it didn't look like it was worth
the effort at that time.
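
The estimate quoted above (10000 x 9 bytes) can be reproduced with a
few lines of Python. The per-allocation overhead is the unknown "???"
term; the 8 bytes used in the second call is purely an assumed figure
for illustration, and the function name is hypothetical.

```python
def duplicated_name_bytes(name, occurrences, allocator_overhead=0):
    """Bytes used when every occurrence of a name is stored separately."""
    return occurrences * (len(name.encode("utf-8")) + allocator_overhead)


# String data alone: 10000 x 9 bytes = 90000 bytes, i.e. about 90 KBytes.
base = duplicated_name_bytes("something", 10000)

# With an assumed (hypothetical) 8 bytes of allocator data per string.
with_overhead = duplicated_name_bytes("something", 10000, allocator_overhead=8)
```

Either way the duplicated names amount to a few hundred KBytes at most
in this example, which is small next to a multi-megabyte tree.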

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

