Re: [Evolution-hackers] compression for folders?



I would ask everyone involved in this discussion to read my comments on
bug 23621.  It's obvious from some of the replies here that they have not
been read.  I see no point in repeating them here, as I was told the bug
report is the proper place for constructive comments.

That said, please understand:

- compression could be made (in my mind) completely transparent if a new
folder type designed with compression in mind were created
- compression could be optional (you don't use it until you opt to
convert a folder to a compressed format folder)
- archiving is separate from compression
- yes, it's fast to append to gzip and even bzip2 data streams, and yes
it takes some CPU time to recompress them; for this reason my proposed
new folder type groups messages, as a fair tradeoff between one huge
mbox stored as a single gzip stream and every message compressed
individually, both of which have obvious drawbacks (time and space,
respectively)
- one could even allow a background thread, or a manually invoked one,
to recompress things for a tighter fit; access time doesn't suffer,
quick writes don't suffer, but recompressing can reclaim more disk
space, especially if the recompressor is allowed to try multiple
algorithms and keep whichever packs a given dataset tightest
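
To make the grouping idea concrete, here is a rough sketch in Python
(purely for illustration; this is not Evolution or Camel code, and the
names append_group, read_group and read_all are made up).  Each group of
messages is compressed as its own gzip member and appended to the folder
file, so nothing already on disk is rewritten, and a single group can be
inflated without touching the rest.

```python
import gzip
import os


def append_group(path, messages):
    """Compress a batch of messages as one gzip member and append it.

    Returns (offset, length) of the new member, which a caller could
    keep in an index to decompress just this group later.
    """
    member = gzip.compress(b"".join(messages))
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)   # be explicit about where the member starts
        offset = f.tell()
        f.write(member)
    return offset, len(member)


def read_group(path, offset, length):
    """Inflate a single group without reading the rest of the file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return gzip.decompress(f.read(length))


def read_all(path):
    """Read every group; gzip readers treat concatenated members as one stream."""
    with gzip.open(path, "rb") as f:
        return f.read()
```

The group size is the tuning knob: bigger groups compress better but
cost more CPU to open a single message, smaller groups do the reverse.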

Hopefully this makes it clear that, in my mind, manpower aside,
compression could be implemented in a way that would not be
objectionable to anyone.
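
As for the background recompression pass, a minimal sketch of "try
several algorithms and keep the tightest" could look like the following.
Again this is Python for illustration only; the CODECS table and the
tightest/unpack names are hypothetical, and a real folder format would
need to record which codec was used for each group.

```python
import bz2
import gzip

# Hypothetical codec table: name -> (compress, decompress).
# Only gzip and bzip2 are listed; others could be added the same way.
CODECS = {
    "gzip": (lambda data: gzip.compress(data, 9), gzip.decompress),
    "bzip2": (lambda data: bz2.compress(data, 9), bz2.decompress),
}


def tightest(data):
    """Compress data with every codec and return (name, blob) of the smallest."""
    best_name, best_blob = None, None
    for name, (compress, _decompress) in CODECS.items():
        blob = compress(data)
        if best_blob is None or len(blob) < len(best_blob):
            best_name, best_blob = name, blob
    return best_name, best_blob


def unpack(name, blob):
    """Inflate a blob using the codec that was recorded for it."""
    return CODECS[name][1](blob)
```

Since this runs off the hot path, it can afford the slowest settings;
readers only pay the decompression cost of whichever codec won.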

On Mon, 2004-05-10 at 22:17, Not Zed wrote:
> On Mon, 2004-05-10 at 16:02 -0700, Ray Lee wrote: 
> > On Mon, 2004-05-10 at 15:28, Jeffrey Stedfast wrote:
> > > you are forgetting the fact that folders are generally not read-only,
> > > and so in order to write any new data to the gzip file, you'd have to
> > > rewrite it from scratch which negates any speed improvements you could
> > > possibly claim.
> > 
> > ray:~$ echo hello | gzip >test.gz
> > ray:~$ echo world | gzip >>test.gz
> > ray:~$ zcat test.gz
> > hello
> > world
> > ray:~$
> > 
> > As long as the archive folders only support appending, there's no need
> > to rewrite the entire file. Further, there's no need to even keep it in
> > one big file (and many good reasons not to). Partition the archives by
> > month, or something.
> 
> FWIW there is actually a reason to store them in one compressed stream
> (vs catting them or separate files).  It will compress a lot better,
> one large stream vs many smaller ones, there is a lot more redundant
> data to compress.  Particularly considering the typical size of email
> messages.
> 
> > > also, as a curiosity, I actually tested this theory and it doesn't hold
> > > true. reading/inflating a gzip file off disk is no faster than reading
> > > the non-compressed file off disk, *and* inflating the gzip file pegs the
> > > cpu so if the app was doing other things then it would negatively impact
> > > performance of those other operations.
> > 
> > This rather obviously depends on CPU speed versus disk speed, yes? If I
> > had a modern CPU with a device that had a transfer speed of 1 byte a
> > second, compressing the stream is an obvious win. If I have a device
> > with a transfer speed of 1 GB/s, it's an obvious loss.
> It also depends on other factors like i/o readahead, async i/o etc.  I
> remember doing an async i/o based GIF decoder on an Amiga 500.  It
> could decode raw gif at about the speed it could be loaded off floppy
> (hmm, 7mhz!), without async i/o it bit, but with async i/o it was much
> faster than loading the raw image would have been.  Still, compression
> is usually much more expensive.
> 
> 
> Michael Zucchi
> <notzed ximian com>
> 
> Ximian Evolution and
> Free Software Developer
> 
> 
> Novell, Inc.
-- 
Todd Fries .. todd fries net

 _____________________________________________
|                                             \  1.636.410.0632 (voice)
| Free Daemon Consulting, LLC                 \  1.405.227.9094 (voice)
| http://FreeDaemonConsulting.com             \  1.866.792.3418 (FAX)
| "..in support of free software solutions."  \  1.700.227.9094 (IAXTEL)
|                                             \          250797 (FWD)
 \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
                                                 
              37E7 D3EB 74D0 8D66 A68D  B866 0326 204E 3F42 004A
                        http://todd.fries.net/pgp.txt







