Re: Request: Test suite for EFS.
- From: Ian McKellar <yakk-gnome-list yakk net au>
- To: Miguel de Icaza <miguel gnu org>
- Cc: mjs eazel com, michael nuclecu unam mx, Daniel Veillard w3 org, gnome-components-list gnome org, gnome-list gnome org
- Subject: Re: Request: Test suite for EFS.
- Date: Thu, 17 Feb 2000 12:30:01 +0800
On Wed, Feb 16, 2000 at 06:01:06AM -0600, Miguel de Icaza wrote:
>
> [Michael: comments on OLE2 at the end]
>
> > Disabling searching to avoid parsing the whole file sounds lame to
> > me. XML is definitely structured. Storing images in it should not be a
> > problem, see the RFC 2397 for one way to do it. Storing embedded
> > objects should be no problem either as long as they serialize to
> > XML. XML is perfectly happy to let you use multiple DTDs in one file.
>
> People from the Windows world are used to multi-megabyte files. Some
> of the Gnumeric test cases for Excel loading are pretty large.
>
> If we use XML exclusively, I wonder who is the brave soul who will be
> scanning a directory for information with an XML file. Consider a few
> hundred files on a server, and you are looking for documents that have
> been edited by "Maciej" at some point in life.
>
> I can picture the disk IO action going up, the memory usage going up
> and the time going up.
>
> Can you picture a way in which this could be solved with XML?
In your structured document DTD, specify that <summary> must be the first
element in a <structureddocument>, so you have:
<structureddocument>
  <summary>
    <authors>
      <person>Tobermory J. Womble</person>
    </authors>
    <editors>
      <person>John Q. Public</person>
      <person>Maciej X. Ample</person>
    </editors>
  </summary>
  ...
You use an event-driven XML API like SAX (which I think gnome-xml supports)
and feed data into the parser 1k at a time from the start of the file until
you get </summary>. With XML you _don't_ need the whole document in memory.
You know that if there is a summary then it will immediately follow the
root <structureddocument> element - if it doesn't then you can ignore the
document. I can't see this being any more memory or disk intensive than an
EFS or OLE2 file - probably slightly more CPU intensive, but not by much -
and you avoid all the platform independence issues. We could even write a
minimal XML parser to do this. I wrote an XML parser the other day because
gnome-xml is too complex for me, and it was about 180 lines of C (using
glib) - a simple SAX parser would be much less.
ALSO: as we only need to read, mmap()ing rather than read()ing is feasible.
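
Here is a rough sketch of the idea above: feed the parser 1k at a time and
stop as soon as </summary> closes. It is in Python rather than C to keep it
short, and it assumes the standard xml.parsers.expat module can stand in for
a SAX-style parser. The names read_editors and _Stop are made up for this
example; they do not come from gnome-xml or any existing code.

```python
import io
import xml.parsers.expat


class _Stop(Exception):
    """Hypothetical helper: raised from a handler to abandon parsing early."""


def read_editors(stream, chunk_size=1024):
    """Return the <person> names under <summary>/<editors>, or None.

    `stream` is a binary file object. It is read `chunk_size` bytes at a
    time, and parsing stops once </summary> has been seen, so the rest of
    the document is never read. None means <summary> was not the first
    child of the root element (or was missing).
    """
    path = []      # stack of currently open element names
    editors = []   # editor names collected so far
    text = []      # character data of the <person> currently open

    def start(name, attrs):
        # The first child of the root must be <summary>; otherwise give up.
        if len(path) == 1 and name != "summary":
            raise _Stop(False)
        path.append(name)
        if name == "person":
            text.clear()

    def end(name):
        if path[-3:] == ["summary", "editors", "person"]:
            editors.append("".join(text).strip())
        path.pop()
        if name == "summary":
            raise _Stop(True)

    def chars(data):
        # Character data may arrive in pieces, so accumulate it.
        if path and path[-1] == "person":
            text.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars

    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            parser.Parse(chunk, False)
    except _Stop as stop:
        return editors if stop.args[0] else None
    return None
```

Scanning a directory is then a loop over the files, opening each one in
binary mode and checking whether any returned name contains "Maciej". Each
file costs only as many 1k reads as it takes to reach </summary>.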
Ian
--
Ian McKellar | Email: yakk(a)yakk.net | Web: http://www.yakk.net/
Fax: +61 (8) 9265 0821 / +0 (775) 205 0307 | Home: +61 (8) 9389 9152
If God didn't want us to eat animals, he wouldn't have made them out of meat.