Re: [xml] memory usage question

From: Daniel Veillard <veillard redhat com>
To: Tuukka Pasanen <tuukka pasanen ilmi fi>
Cc: xml gnome org
Subject: Re: [xml] memory usage question
Date: Wed, 17 Aug 2005 11:18:46 -0400

On Wed, Aug 17, 2005 at 06:06:47PM +0300, Tuukka Pasanen wrote:

Hi,
I ran into this problem earlier, and using a dictionary actually saves a
great deal of memory. We had an XML file with around 20000 repetitions of
the same set of tags; it took something like 200 MB, and with a dictionary
it took only 20 MB. So the dictionary removes this cumulative memory usage.
I think I should post a little example, because I've been playing with this
lately.


  Right, basically the dictionary is used for "fixed strings" which are
likely to repeat over and over, in practice:
   - tag names
   - attribute names
   - markup names in general
   - blank strings used for indentation
   - very short PCDATA strings
The goal is to remove most of the redundancy without risking an
explosion of the dictionary size due to randomized content.
  The dictionary can also speed things up a lot (for example in XSLT)
by allowing direct string pointer comparison instead of comparing strings
character by character.
There is also a very fast xmlDictOwns routine which checks whether
a string belongs to a given dictionary.
  The dictionary is used by default if one uses the new xmlReadxxx APIs.
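
  To make the idea concrete, here is a minimal sketch of string interning
in plain C. It is not libxml2's implementation and does not call libxml2;
the names Dict, dict_lookup, dict_owns, dict_free and demo are hypothetical
and exist only for this illustration. It shows the two properties described
above: repeated strings are stored once, and equal interned strings can be
compared by pointer. The ownership check here is a naive scan, unlike the
fast xmlDictOwns mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

/* One interned string, chained in a hash bucket (toy layout, an assumption). */
typedef struct Entry {
    struct Entry *next;
    char *str;
} Entry;

typedef struct {
    Entry *buckets[NBUCKETS];
    size_t count;               /* number of distinct strings stored */
} Dict;

static unsigned long hash_str(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Return the unique stored copy of s, adding it on first sight. */
static const char *dict_lookup(Dict *d, const char *s) {
    unsigned long b = hash_str(s) % NBUCKETS;
    for (Entry *e = d->buckets[b]; e; e = e->next)
        if (strcmp(e->str, s) == 0) return e->str;
    Entry *e = malloc(sizeof *e);
    if (!e) return NULL;
    size_t n = strlen(s) + 1;
    e->str = malloc(n);
    if (!e->str) { free(e); return NULL; }
    memcpy(e->str, s, n);
    e->next = d->buckets[b];
    d->buckets[b] = e;
    d->count++;
    return e->str;
}

/* Naive ownership check: is p one of the stored pointers? */
static int dict_owns(const Dict *d, const char *p) {
    for (size_t b = 0; b < NBUCKETS; b++)
        for (const Entry *e = d->buckets[b]; e; e = e->next)
            if (e->str == p) return 1;
    return 0;
}

static void dict_free(Dict *d) {
    for (size_t b = 0; b < NBUCKETS; b++) {
        Entry *e = d->buckets[b];
        while (e) {
            Entry *next = e->next;
            free(e->str);
            free(e);
            e = next;
        }
        d->buckets[b] = NULL;
    }
    d->count = 0;
}

/* Intern the same tag name 20000 times, as a parser would for repeated tags.
   Returns 0 if every lookup yields the same pointer and only one copy exists. */
int demo(void) {
    Dict d = {0};
    const char *first = dict_lookup(&d, "item");
    int ok = first != NULL;
    for (int i = 0; ok && i < 20000; i++) {
        char buf[8];
        strcpy(buf, "item");                    /* fresh copy, as if just parsed */
        ok = dict_lookup(&d, buf) == first;     /* pointer comparison, no strcmp */
    }
    ok = ok && d.count == 1 && dict_owns(&d, first);
    dict_free(&d);
    return ok ? 0 : 1;
}
```

  Calling demo() from a main function should return 0: the 20000 lookups
all yield the pointer obtained from the first one, so only one copy of
"item" is ever allocated.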

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
