Re: Last version of the new summary storage engine



Dirk-Jan Binnema nokia com wrote:

I tried sending a reply, but it seems that failed. If you received one, here's another one :)
>  o. Duplicate strings are made unique in the mmap
> Philip, I am a bit skeptical of this claim -- can you back that up
> with some numbers? I mean let's take some *extreme* case, where the
> same 10-byte string is seen 1000x in a mailbox; by having only
> one of them we save about 10k. Not very much compared to the normal
> size, even in this extreme case. At least the mem saving is
> rather easy to calculate :)
The problem that this new summary storage is trying to solve is not the amount of (virtual) memory being used (we call this VmSize). If that were the problem, I wouldn't have done the current summary.mmap format the way I did it (because it wastes a lot of virtual memory).
The real problem is the following:

Assume we have the string "Dirk-Jan C. Binnema <Dirk-Jan Binnema nokia com>" as the From address of 20 messages. I received some of these 20 messages a long time ago and some of them very recently. The strings in the summary are therefore scattered over the mapped file: some near the beginning, some in the middle and some at the end.
Those strings are, however, identical in content. Not in location, but
in content (which is what matters), they are the same string.
The problem with that (imagine that I sort my list on "Dirk-Jan") is
that each and every one of those strings will be needed in real memory
modules. The kernel only knows about pages, and a page is four
kilobytes in size. This means that each page that contains such an
identical string must be paged (swapped) into real memory (we call
this VmRSS).
It would be better if the string was stored only once, close in
location to other frequently required strings. That way fewer pages
would be needed in real memory modules (a smaller VmRSS).
If we cat /proc/$PID/status we see all of the memory, both real memory
and just-mapped, as VmSize: this is the virtual memory's size. This
number is mostly worthless, as it contains the memory needed for the
mappings (the libraries and the files that we mapped ourselves) plus
the stack and the heap allocations (like gslice and malloc). By
contrast, valgrind's massif tool measures only heap and stack
allocations. That is also the most interesting part, as those
allocations have a direct impact on the VmRSS.
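To make the two numbers concrete, here's a minimal sketch (just an
illustration for Linux, not Tinymail code) that prints them for its
own process:

  #include <stdio.h>
  #include <string.h>

  /* Print our own VmSize (all virtual memory, including mappings)
   * and VmRSS (the pages actually resident in real memory). */
  int
  main (void)
  {
      char line[256];
      FILE *f = fopen ("/proc/self/status", "r");

      if (f == NULL)
          return 1;

      while (fgets (line, sizeof (line), f))
          if (strncmp (line, "VmSize:", 7) == 0 ||
              strncmp (line, "VmRSS:", 6) == 0)
              fputs (line, stdout);

      fclose (f);
      return 0;
  }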
For Tinymail, what valgrind reports is not really the most interesting 
part. Although what valgrind reports has a direct impact on VmRSS, my 
summary's mapping has a larger impact on said VmRSS. When I say that 
Tinymail uses 10MB of heap, I don't mention that Tinymail also has a 
mapped file of 15MB that is frequently being thrashed. VmRSS (the 
amount of real memory being used) will therefore probably be around 15 
or maybe 20 MB instead of the 10MB of heap that valgrind reported.
The VmRSS indicates the amount of memory that is actually being 
accessed: the memory that is (therefore) in real memory modules. 
Compared to the VmSize, this is a very important number (the VmSize is 
actually not interesting at all, as it's a number polluted by the 
mappings of the libraries, and because being part of the virtual 
memory doesn't mean that a page is actually going to be frequently 
used).
How do we further reduce VmRSS? The answer is simple: we lay out the 
memory of the mapping efficiently, by putting memory that is going to 
be frequently needed together in the same pages of the mapping. What 
memory will that be?
In my example the string "Dirk-Jan C. Binnema 
<Dirk-Jan Binnema nokia com>" will be needed 20 times. The string 
"Tinne Hannes <tinne hannes gmail com>" will probably be needed even 
more often in my personal situation. Sergio, however, will have about 
as many e-mails from you as I have, but probably only one from Tinne. 
We conclude that in both cases your From string is important, but 
Tinne's From string will only be important for me.
It's therefore quite likely that both your string and Tinne's will be 
needed in real memory modules on my tablet PC, whereas on Sergio's 
tablet PC only yours will be needed and Tinne's will perhaps never be 
paged in at all.
That's why I sort strings by their frequency of usage in the mapping. 
That way the frequently needed strings all end up in the same pages, 
and the less frequently needed strings are likewise grouped together 
in theirs.
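As a sketch of the idea (hypothetical helper names, GLib types because
that's what Tinymail uses; this is not the actual summary writer):
count how often each string occurs, then write the unique strings
ordered by that count, so the hottest ones share the first pages:

  #include <glib.h>
  #include <stdio.h>
  #include <string.h>

  /* Order the unique strings by how often they occur, most frequent
   * first, so the hot ones end up together in the first pages of the
   * mapped file. */
  static gint
  cmp_by_count (gconstpointer a, gconstpointer b, gpointer counts)
  {
      gint ca = GPOINTER_TO_INT (g_hash_table_lookup ((GHashTable *) counts, a));
      gint cb = GPOINTER_TO_INT (g_hash_table_lookup ((GHashTable *) counts, b));
      return cb - ca; /* descending */
  }

  static void
  write_clustered (const gchar **strings, gsize n, FILE *out)
  {
      GHashTable *counts = g_hash_table_new (g_str_hash, g_str_equal);
      GList *unique = NULL, *iter;
      gsize i;

      /* One hashtable insert per item: duplicates are detected by the
       * string hash (integer comparisons), not by scanning. */
      for (i = 0; i < n; i++) {
          gint c = GPOINTER_TO_INT (g_hash_table_lookup (counts, strings[i]));
          if (c == 0)
              unique = g_list_prepend (unique, (gpointer) strings[i]);
          g_hash_table_insert (counts, (gpointer) strings[i],
                               GINT_TO_POINTER (c + 1));
      }

      unique = g_list_sort_with_data (unique, cmp_by_count, counts);

      /* Each unique string written once, NUL-terminated, hottest first. */
      for (iter = unique; iter; iter = iter->next) {
          const gchar *s = iter->data;
          fwrite (s, 1, strlen (s) + 1, out);
      }

      g_list_free (unique);
      g_hash_table_destroy (counts);
  }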
On speed: paging from the mapped file (part of VmSize) into real 
memory (VmRSS) is a slow operation. The fewer times we have to do 
this, the faster our application will be, especially when doing things 
like searching and sorting.
The mantra "using less memory results in a faster application" is very 
true in our case.
> What is much harder, there's also a (significant?) *cost* to scan for
> duplicate strings, to reconstruct them and so on, when showing a folder.
With a hashtable-based implementation you don't need to scan for duplicates. You just add items using the string as key. Hashtables are fast at this because they make a hash of the string key, so duplicate detection comes down to integer comparisons. Once all items are in the hashtable, you just walk the table writing out each string, updating the index with the location (that's the "affected" list in the test).
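A rough sketch of that approach (illustrative names only; the real
code keeps a proper index, this just shows the hashtable doing the
dedup work):

  #include <glib.h>
  #include <stdio.h>
  #include <string.h>

  /* The first time a string is seen it is appended to the file and
   * its offset remembered; every later occurrence of the same string
   * reuses the recorded offset in the index. No scanning needed. */
  static glong
  intern_string (GHashTable *offsets, const gchar *str, FILE *out)
  {
      gpointer val;
      glong off;

      /* lookup_extended, because a recorded offset may be zero */
      if (g_hash_table_lookup_extended (offsets, str, NULL, &val))
          return (glong) GPOINTER_TO_INT (val);

      off = ftell (out);
      fwrite (str, 1, strlen (str) + 1, out); /* NUL-terminated */
      g_hash_table_insert (offsets, (gpointer) str,
                           GINT_TO_POINTER ((gint) off));
      return off;
  }

(The table would be created with g_hash_table_new (g_str_hash,
g_str_equal), so keys compare by content, not by pointer.)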
Another reason why this will be a lot faster is that each mapped file 
will contain only 1000 items, whereas the current mapped file simply 
contains all items. Rewriting a folder with 50,000 items means that we 
have to do 50,000 x 4 string writes. With this summary storage we will 
only rarely have to uniq and sort a file with 1000 x 4 strings, plus a 
sequential index and a sequential flags file (writing the sequential 
ones is too fast to even mention, really).
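Put as numbers: 50,000 items x 4 strings = 200,000 string writes 
today, against 1,000 x 4 = 4,000 for the one affected block; roughly a 
50x reduction whenever a change touches only a single block.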
Ironically we also write a smaller mapped file because we have fewer 
strings to write (we made them unique): writing is slower than 
searching for duplicates. And that was a test on a fast harddisk, not 
even on an ultra slow flashdisk. Although, indeed, a tablet's memory 
access will be slower than a PC's memory access too (so it's not a 
guarantee, but we'll test this of course).
I tested this with 100,000 items (with 50% duplicates, which is very 
much like the Linux mailing list's folder): this was actually faster 
than writing 50,000 items using the current summary format.
> To me at least it's not obvious that this optimization will not actually
> slow things down.
That's why I'm making this test. To test that :)

> So, before rolling this out, I'd recommend you to do some numbers of the
> difference with/without string clustering (this should include showing
> the items).
Of course.

Thanks for being concerned. I hope my long reply clarified certain things.

I fear I only added confusion, though. The important thing is the difference between VmSize and VmRSS.


