Re: don't understand, memory leak or memory pools??



On Sat, 8 Feb 2003, Olivier wrote:

swap, and made sure the machine was quite out-of-memory, processes got
killed) and I guess that should have impact on the caching? Or did I
understand things wrongly?

Yes, slightly :)

(btw, thanks for actually /doing/ the googling -- most newbies on the
internet won't.  They typically won't even read the explanations they ask
so desperately for, so some years ago I stopped giving nice and polite
answers until they showed that they were actually sufficiently interested
in the answer :) -- anyway, end of rant)


The standard implementation of malloc goes somewhat like this.

A process has a contiguous area of memory it can read and write which it
gets from the operating system.

Some of this has been carved out into little pieces that malloc has handed
out.  When free is called those pieces are kept in a list (typically
called freelist or something like that) so malloc can find a suitable
piece there instead of bothering the operating system.

If the freelist is empty or the pieces there are too small, malloc will
ask for more memory from the operating system by calling sbrk.  sbrk is a
Unix call that simply increases the size of the area that the process can
read/write.  Let me put that more clearly: it is a single, big area of
memory with no holes in it.  sbrk simply requests that the end of the area
be a specific address (*).  It could in principle be used both to increase
and decrease the size of the area.

When enough memory has been released by the user program through free()
calls, why doesn't the process shrink that area by calling sbrk again with
a lower address (**)?  Because there might be a single allocated piece of
memory at the top of the area...  Remember that the area (the "heap") may
be fragmented.  There is no way (***) to compact the heap so all the
holes gather together into a single hole at the top which can then be
given back to the operating system.

There's also the empirical fact that when large amounts of memory have been
allocated and then freed the program will typically either exit shortly
afterwards (in which case it would be pointless to explicitly give the
memory back -- it will be reclaimed automatically anyway) or it will
allocate lots of memory again.  In some cases it won't but then most of
the memory might end up in swap anyway, so it won't really hurt.

This is not the whole story, of course.  Good memory allocators -- like
the one Doug Lea wrote which is used in Linux (****) -- will do something
different if the user program tries to malloc very big chunks.  Instead of
using sbrk, it will mmap from /dev/zero -- a Unix/Linux idiom for getting
the memory at just about any virtual address that will make the operating
system happy.  Those big chunks /will/ be given back to the operating
system when freed.

So the upshot is:

 * in most cases don't worry about it.  It is a sensible optimization
   which won't hurt anybody.

 * the memory allocator won't know whether your machine has any swap space
   or not.  It also won't know what other memory hungry processes you may
   have running at the same time.

 * if you really care about it (because your use case is unusual) then
   learn something about memory pools, arenas and all that.  I'm afraid
   you will have to understand the "technical" details :)

 * the memory allocator can usually be tuned through the setting of
   various variables or use of special function calls.  These won't be
   standardised, of course, so your program will become non-portable.  The
   same goes in most cases for what I wrote about memory pools and arenas
   above.

 * I think GLib's g_malloc function can be tuned, too (that is GLib, the
   GTK+ utility library, not glibc).  In particular, I think you can
   override which function it /actually/ uses to allocate the memory
   (since you are the one with the problem, not I, I can't be bothered to
   check the details).  That might be a more portable way forward if you
   /really/ need it -- which I don't think you do, to be honest.


*) actually, brk() asks that the data segment end at a specific
address; sbrk() asks for a specific increase of the data segment.
Details, details, details.

**) well, some allocators actually will do that sometimes (including Doug
Lea's).  It's just not very often both 1) possible and 2) worth the
bother.

***) Yes, there is something called garbage collection, which is typically
used in high-level languages such as Lisp, Haskell, ML, etc... ;) -- there
is also the Boehm conservative garbage collector for C/C++ but that's
another story for some other time.

****) the one in glibc is based on an older version of Doug Lea's
allocator, so the newest of each won't quite match.

-Peter



