Re: gslice allocator: general impressions.

On Thu, 15 Jun 2006, Michalis Giannakidis wrote:

Dear Glib team,

I have recently used the GLib GSlice allocator in a very malloc-intensive
application, and I would like to share a few observations with you.

The typical size I send to g_slice_alloc is 20 and 76 bytes (32 bit build).
The total memory consumed by my application has decreased - which is great!
However it `seems' to me that this decrease in size is larger for the 64-bit
build than for the 32-bit build. (I have modified gslice to align a chunk
to its own size so that I don't have unused memory between chunks, eg: request
20 and get 24...)

you mean you've set P2ALIGNMENT to 1 * sizeof(void*) to get better granularity
of the slices? have you left NATIVE_MALLOC_PADDING at 2*sizeof(void*) at least?
(if not, memory fragmentation on some glibc systems and on all 64bit systems
can become extremely bad).

In my application I now use both g_slice_alloc and malloc. Testing the new
allocator on Linux I came across the following:
After using g_slice_alloc to allocate a great amount of data (eg 1500MB), each
chunk being 20 or 76 bytes in size, subsequent calls to malloc(8*1024) or
similar sizes take much longer (the allocation does _not_ go to swap).

this is a magnitude of memory allocations where gslice hasn't been well
tested. (my machine only has 1GB of memory for instance).
while gslice should perform well even much beyond 1GB, i'd suspect that
malloc()/memalign() don't, and thus could result in *some* GSlice slowdown
as well.

Also if I change the min_page_size from 128 to 4096, then the subsequent calls
to malloc(8*1024) or similar sizes take even longer. It takes forever in the
64-bit build!

maybe due to your alignment tweaks. but maybe also because some malloc() buckets
will have even longer lists that way.

Moreover, even with min_page_size set to 128, the calls to malloc take
forever on Linux 2.4.20 with glibc-2.2.4 (glibc up to 2.2.5 has a broken
posix_memalign, so I fall back to memalign).

broken in what way?

In all of the cases above, malloc responds much faster when the gslice
allocator is turned off.
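For comparison runs like the one above, GSlice can be bypassed at runtime; this assumes GLib's G_SLICE environment variable (present since GLib 2.10), and the binary name is of course hypothetical:

```shell
# "always-malloc" makes g_slice_alloc() forward straight to malloc(),
# which is a convenient way to time an app with GSlice "turned off"
G_SLICE=always-malloc ./your-app
```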

apparently some malloc implementations aren't too good at handling large numbers
of page allocations.
if modifying GSlice is an option for you, you could try to work around this at
the expense of read-only allocation of the memory used by GSlice.
to do that, you need to disable the HAVE_COMPLIANT_POSIX_MEMALIGN, HAVE_MEMALIGN and HAVE_VALLOC branches in allocator_memalign() and
allocator_memfree() so the read-only malloc()-based fallback allocator is
enabled. on 32bit systems, that'll by default allocate 16*4096=64k areas
to allocate pages from. by doing:
- const guint n_pages = 16;
+ const guint n_pages = 2560;
you'll instead allocate 10MB areas, which should put far less stress on the
system malloc() (that way, to fill 1GB, a maximum of about 100 area
allocations are needed).
I know I should be providing sample code and test programs here to back up
my observations. I also understand that I should provide execution times
instead of just stating that the calls to malloc take forever.

test programs would be great. although this kind of stress test can only be
run and examined on a limited set of machines. e.g. the development machines i
use have just 1GB and normally have only 30%-40% of free memory to spare
for tests during runs like make check -C glib/test ;)

I would like to ask you, if there are any known issues in mixing gslice and
malloc calls in malloc intensive applications.
Any suggested readings would be most welcome.

you've already seen the two papers cited in the big comment block at the start
of gslice.c? the first of those describes a kernel page allocator which isn't
implemented by GSlice. we use memalign() instead, on the assumption that this
is good enough for the vast majority of applications out there.
if malloc is too stressed out by the aligned allocations gslice makes in your
scenario, implementing this page allocator on top of mmap() and using that
as a base allocator for gslice may be another option.


