Re: [Evolution-hackers] IMAP summary synchronization improvements



On Tue, 2006-10-24 at 13:34 +0200, Philip Van Hoof wrote:

> The only API call that makes this incompatible with a normal Camel is
> the camel_folder_summary_dump_mmap. If you diff camel-imap-folder.c of
> Evolution's Camel with this camel-imap-folder.c of Tinymail's Camel, and
> remove those API calls, you'll have a working Camel that doesn't use the
> mmap stuff but that will receive and update summary faster from IMAP,
> using less memory.

A new memory analysis shows that this implementation no longer tops out
at 12 MB of memory for 30,000 headers (which was already down from the
35 MB that today's Evolution Camel consumes when receiving 30,000
headers). It now tops out at 8 MB.

In other words, 35 / 8 = roughly 4.4 times less memory.
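
As a quick check of the figures above (35 MB, 12 MB and 8 MB for 30,000
headers), here is a small Python sketch. It assumes 1 MB = 1024 * 1024
bytes, which the measurements do not state, and the function names are
only for illustration:

```python
# Peak memory figures quoted in this mail, in MB, for 30,000 headers.
EVOLUTION_CAMEL_MB = 35   # today's Evolution Camel
FIRST_VERSION_MB = 12     # earlier measurement of this implementation
CURRENT_VERSION_MB = 8    # new measurement of this implementation

HEADERS = 30_000


def reduction_factor(before_mb, after_mb):
    """How many times smaller the peak is after the change."""
    return before_mb / after_mb


def bytes_per_header(total_mb, headers=HEADERS):
    """Average peak memory per header (assumes 1 MB = 1024 * 1024 bytes)."""
    return total_mb * 1024 * 1024 / headers


print(reduction_factor(EVOLUTION_CAMEL_MB, CURRENT_VERSION_MB))  # 4.375
print(reduction_factor(FIRST_VERSION_MB, CURRENT_VERSION_MB))    # 1.5
print(round(bytes_per_header(CURRENT_VERSION_MB)))               # 280
```

So the peak works out to roughly 280 bytes per header, averaged over the
whole run.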

Bandwidth saving: the original version was split into three major
contributors to bandwidth consumption.

The current implementation in Evolution's Camel:

- Part one checks for changes against the existing on-disk cached
headers. This only verifies the flags, doesn't consume a lot of memory
or bandwidth, and is therefore unchanged.

- Part two receives all the possibly needed headers for every message
that is new, keeping a copy of each in a GData which is stored in a
GPtrArray until part three (and parts four and five) are finished.

- Part three receives the remainder of the needed information for all
headers that were incomplete (or, I don't know, for which something
didn't work out or ... ?).

- Parts four and five basically merge the results of parts two and three
and instruct the summary infrastructure to add the information as
summary.


My improved implementation:

- Part one is exactly the same

- The next parts are kept in a shortened loop (max 1000 headers at a
time)

  - Part two receives all the UIDs of headers that were not seen in part
one and are therefore believed to be new. It only receives UIDs; it
doesn't ask for any other header yet (only 1000 at a time).

  - Part three receives, for all those UIDs, exactly the headers that
are needed (and not a header more), creates CamelMessageInfo instances
and instructs the summary infrastructure to store them. Every 1000th
header it instructs the summary infrastructure to dump the stored items
to disk and reload them using the mmap technique (which moves 56% of the
memory from allocated to mmapped address space).
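
The shortened loop described above can be sketched roughly as follows.
This is not the Camel code: it is a minimal Python model, and every name
in it (sync_new_headers, fetch_uids, fetch_headers, store,
dump_and_remap) is a hypothetical stand-in for the IMAP requests and the
summary infrastructure. The server is faked with an in-memory dict so
the sketch runs on its own.

```python
BATCH_SIZE = 1000  # "max 1000 headers at a time"


def sync_new_headers(last_known_uid, fetch_uids, fetch_headers,
                     store, dump_and_remap, batch_size=BATCH_SIZE):
    """Fetch new headers in batches and flush the summary after each batch.

    All four callables are hypothetical stand-ins, not Camel API.
    Returns the number of headers stored.
    """
    total = 0
    while True:
        # Part two: ask for UIDs only, at most batch_size of them.
        uids = fetch_uids(last_known_uid, batch_size)
        if not uids:
            break
        # Part three: ask for the needed headers of exactly those UIDs
        # and hand each result to the summary.
        for info in fetch_headers(uids):
            store(info)
            total += 1
        # After each batch: dump to disk and reload (mmap in the original).
        dump_and_remap()
        last_known_uid = uids[-1]
    return total


# In-memory fake of a folder with 2500 messages, for demonstration only.
server = {uid: "Subject: spam %d" % uid for uid in range(1, 2501)}


def fake_fetch_uids(after_uid, limit):
    return [uid for uid in sorted(server) if uid > after_uid][:limit]


def fake_fetch_headers(uids):
    return [(uid, server[uid]) for uid in uids]


summary = []   # stands in for the stored CamelMessageInfo instances
dumps = []     # summary size recorded at each simulated dump

total = sync_new_headers(0, fake_fetch_uids, fake_fetch_headers,
                         summary.append,
                         lambda: dumps.append(len(summary)))
print(total, dumps)  # 2500 [1000, 2000, 2500]
```

The point of the structure is that at most one batch of freshly received
headers is held in allocated memory before it is flushed, instead of
keeping every new header around until the final merge.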



The remaining memory consumption comes from the CamelMessageInfo struct
instances and the last 1000 stored headers, plus of course a few
allocations that are still unknown to me (but I will find them sooner or
later).

I believe I can shrink memory consumption of downloading information
from the IMAP service (updating the summary) from 8MB to 6MB for 30,000
headers.

If you want to reproduce these results, there's an IMAP account
available at mail.tinymail.org, u:tinymailunittest, p:unittest. It has
folders named "30000" and "20000" which contain ~30,000 and ~20,000
headers of pure spam, spam and more spam. This is extremely good testing
material: very few spam headers are identical, so there is very little
pstring memory saving, which makes it a very realistic worst-case
scenario.
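
For anyone who first wants to check that the test folders are reachable,
here is a minimal sketch using Python's standard imaplib. It is an
assumption that the server accepts plain IMAP on the default port (it
may require SSL instead, in which case imaplib.IMAP4_SSL would be
needed), and the helper names are hypothetical. Nothing connects until
count_messages() is called.

```python
import imaplib

# Account details as given in this mail.
HOST = "mail.tinymail.org"
USER = "tinymailunittest"
PASSWORD = "unittest"


def parse_exists(select_data):
    """SELECT reports the message count as bytes, e.g. [b'30000']."""
    return int(select_data[0])


def count_messages(folder, host=HOST, user=USER, password=PASSWORD):
    """Log in, open `folder` read-only and return its message count."""
    conn = imaplib.IMAP4(host)  # assumption: plain IMAP, default port 143
    try:
        conn.login(user, password)
        typ, data = conn.select(folder, readonly=True)
        if typ != "OK":
            raise RuntimeError("could not select folder %r" % folder)
        return parse_exists(data)
    finally:
        conn.logout()


# Example (performs a network connection):
# print(count_messages("30000"))
```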


-- 
Philip Van Hoof, software developer
home: me at pvanhoof dot be
gnome: pvanhoof at gnome dot org
work: vanhoof at x-tend dot be
blog: http://pvanhoof.be/blog



