Re: Performance reading/writing .gnumeric files



On Sat, Jan 19, 2002 at 11:34:57AM -0500, Jody Goldberg wrote:
On Sat, Jan 19, 2002 at 07:21:30AM +0000, Nick Lamb wrote:

BUT this is import only and Gnumeric still bloats when you SAVE anything,
We're still using DOM for export, which is even worse than DOM for
import because you need to use approx 4x the memory to store the
DOM tree, then double that to dump the tree into a buffer which is
then stored.  That is why we will be moving to a printf based
solution for xml export.
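
A minimal sketch of what a printf-style export could look like, written in
Python rather than Gnumeric's C purely for illustration. The element and
attribute names (Cells, Cell, Row, Col) and the write_cells helper are
assumptions, not the real Gnumeric schema or code. The point is only that
each cell is formatted and written straight to the output stream, so no
tree and no intermediate full-document buffer are needed:

```python
import io
from xml.sax.saxutils import escape, quoteattr

def write_cells(out, cells):
    """Write each cell directly to `out`; nothing is kept in a tree."""
    out.write("<Cells>\n")
    for row, col, text in cells:
        # quoteattr() adds the surrounding quotes and escapes the value;
        # escape() handles &, < and > in the cell content.
        out.write("<Cell Row=%s Col=%s>%s</Cell>\n"
                  % (quoteattr(str(row)), quoteattr(str(col)), escape(text)))
    out.write("</Cells>\n")

# Hypothetical sample data; a real exporter would walk the sheet instead.
buf = io.StringIO()
write_cells(buf, [(0, 0, "=foo()"), (1, 0, "a < b & c")])
result = buf.getvalue()
```

Memory use here stays roughly constant per cell, whereas the DOM route
grows with the size of the whole sheet.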

  Yep, DOM uses a lot of memory. And since the input/output of a gnumeric
file doesn't require random access to the document content, SAX makes far
more sense. When I worked on the first version of the loader we didn't
have the same level of requirements, and the DOM was fine.
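
To illustrate why SAX fits a load that never needs random access, here is
a small sketch using Python's standard xml.sax module (not libxml2's C SAX
interface, which the real importer would use). The CellCounter class and
the sample document are hypothetical. Each cell is handled as it streams
past and then forgotten, so only one cell's text is held at a time:

```python
import xml.sax

class CellCounter(xml.sax.ContentHandler):
    """Count cells and formula cells without building a tree."""

    def __init__(self):
        super().__init__()
        self.cells = 0
        self.formulas = 0
        self._in_cell = False
        self._text = []

    def startElement(self, name, attrs):
        if name == "Cell":
            self._in_cell = True
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_cell:
            self._text.append(content)

    def endElement(self, name):
        if name == "Cell":
            self.cells += 1
            if "".join(self._text).startswith("="):
                self.formulas += 1
            self._in_cell = False

# Hypothetical two-cell document, not a real .gnumeric file.
doc = (b'<Cells><Cell Row="0" Col="0">=foo()</Cell>'
       b'<Cell Row="1" Col="0">42</Cell></Cells>')
handler = CellCounter()
xml.sax.parseString(doc, handler)
```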

That Content node is irrelevant.  When I loaded up one of JP's test
files and resaved it as
    <Cell attr="" attr="" ...>=foo()</Cell>

The file shrank from 2.5 Meg of uncompressed xml to 1.2 Meg of xml.
Loading was then somewhat faster: 25 seconds vs 29 seconds for a full
code start and exit.  With luck the sax importer will buy us at
least that level of improvement again.  From there we'll need to

  libxml2's SAX parser handles XML data at 5 MBytes/sec or more. The switch
to libxml2 should also improve this quite a bit. Even building the DOM should
be quite fast:

orchis:~ -> ls -l XSLT/tests/XSLTMark/db10000.xml
-rw-rw-r--    1 veillard www       2009240 Jan 17 10:39 XSLT/tests/XSLTMark/db10000.xml
orchis:~ -> /usr/bin/xmllint --noout --timing XSLT/tests/XSLTMark/db10000.xml
Parsing took 1370 ms
Freeing took 300 ms
orchis:~ -> 

  approximately 1.7 seconds for building the DOM tree and freeing it,
and the file is approximately the same size as your example.
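
The arithmetic behind those figures, worked through as a quick sketch. The
inputs are taken from the ls and xmllint output above; the variable names
are just for illustration, and MB here means 10^6 bytes:

```python
size_bytes = 2009240   # file size reported by ls -l
parse_ms = 1370        # "Parsing took" from xmllint --timing
free_ms = 300          # "Freeing took" from xmllint --timing

# Total time to build the DOM tree and free it again, in seconds.
total_s = (parse_ms + free_ms) / 1000

# Throughput of the DOM build alone, in MB per second.
parse_mb_per_s = size_bytes / (parse_ms / 1000) / 1e6

print("build + free: %.2f s" % total_s)
print("DOM build rate: %.2f MB/s" % parse_mb_per_s)
```

So on this machine the DOM build alone runs at roughly 1.5 MB/s for a
2 MB file.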

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


