Changing chunk size of {GList,GSList,GNode} allocators



Currently, the GList, GSList, and GNode allocators are set to allocate
pools of 1024 items (resulting in 12KB, 8KB, and 20KB pools, respectively,
on a 32-bit machine)...

[skip to the end if you don't want to read my -long- crapola, which is
mostly a scratchpad of figuring out why the changes are needed, and just
want the patch]

The reason allocators/memchunks are used is to avoid the CPU & memory
overhead of mallocing the items individually. Setting these allocator
pools to 1024 items in size results in a waste compared to an optimal
size. Here's (a long explanation) why:

On Linux glibc systems, malloc() has an overhead of four bytes per block
allocated. Here are the numbers of bytes that are wasted when various
amounts of the items are malloc'd:

# of items	bytes wasted
1		4
5		20
10		40
50		200
100		400
250		1000
500		2000
1000		4000

Now, when using memory pools of a certain size, you will obviously only
have an overhead of 4 bytes per memory pools. Assuming that there is a
random distribution of the maximum number [1] of items allocated, there
will be on average half of a memory pool unused in a program.

This leads to this chart (it ignores the negligible fixed overhead of
creating a memchunk):

Avg bytes wasted = 4+(pool_nitems/2*sizeof(item))
avg % wasted = nbytes_wasted/(pool_nitems*sizeof(item) + 4)

		       - average bytes wasted -
Pool items      GList           GNode           GSList         
1               4               4               4
2               16              24              12
4               28              44              20
8               52              84              36
16              100             164             68
32              196             324             132
64              388             644             260
128             772             1284            516
256             1540            2564            1028
512             3076            5124            2052
1024            6148            10244           4100

Now, as the pools get bigger and bigger, the per-pool overhead is less
important, but when using bigger & bigger pools the average number of
items wasted goes up.

Right now an average of 6K is being wasted by GList, 10K by GNode, and 4K
in GSList, in almost every glib program. If we assume that 10000 items are
going to be allocated of each (which is a BIG overestimate at this point -
the maximum usage of GList nodes in testgtk is around 500, and even in
balsa "abuser of GList" Pav said the usage is around like 5000 nodes) then
my calculations say that the best pool size for each pool is:

'b' = bytes wasted
'p' = num items in pool
'i' = size of item

Find minimum b for i=sizeof(GSList), sizeof(GList), sizeof(GNode) 
aka i=8,12,20

   per-pool waste   avg unalloced bytes
b= (10000/p*4)    + (p/2*i)

GSList:
	b=40000/p + p*4
	dp/db=-40000/(p*p) + 4

	solve for the minimum
	-4=-40000/(p*p)
	1/10000=1/(p*p)
	p=sqrt(10000)

	p=100
	b=800 at p=100
	b=800.04 at p=101
	b=800.04 at p=99
	p=100 is a minimum.
GList:
	b=40000/p + p*6
	dp/db=-40000/(p*p) + 6
	-6=-40000/(p*p)
	40000/6 = p*p
	p=sqrt(40000/6)
	sqrt(40000/6) = 81.650

	p = 82
	b=979.80 at p=82
	b=979.83 at p=81
	b=979.93 at p=93
	p=82 is a minimum.

GNode:
	b=40000/p + p*10
	dp/db=-40000/(p*p) + 10
	-10=-40000/(p*p)
	p=sqrt(40000/10)
	sqrt(4000) = 63.246

	p = 63
	b=1264.9 at p=63
	b=1265.2 at p=62
	b=1265 at p=64
	p=63 is a minimum.

Someone double-check my math, please.

Yes, there is a slight slowdown, because if a program uses a maximum of
10000 GList nodes, instead of being called 10 times to allocate a pool of
1024 items, malloc will be called 122 times to allocate a pool of 82.
However, the memory savings of 6K is worth more than the extra microsecond
of CPU time.

So, I would like to request that the chunk size of each allocator be
changed to chunk sizes that will produce a lower average # of bytes
wasted. A patch to change it to the follows .signature, for reference
purposes.
-- Elliot
"In film you will find four basic story lines. Man versus man, man
 versus nature, nature versus nature, and dog versus vampire."
    - Steven Spielberg

[1] I say "maximum" because GList, GSList, and GNode all use ALLOC_ONLY
GMemChunk's - once an item is allocated from the memory chunk, it is never
returned to the free list. If we could predict this maximum number
accurately, we could use it to set some hints for glib, but it varies from
application to application.

Index: glist.c
===================================================================
RCS file: /cvs/gnome/glib/glist.c,v
retrieving revision 1.8
diff -u -r1.8 glist.c
--- glist.c	1998/12/16 05:38:22	1.8
+++ glist.c	1999/01/19 15:32:01
@@ -102,7 +102,7 @@
   if (!current_allocator)
     {
       GAllocator *allocator = g_allocator_new ("GLib default GList allocator",
-					       1024);
+					       82);
       g_list_validate_allocator (allocator);
       allocator->last = NULL;
       current_allocator = allocator;
Index: gnode.c
===================================================================
RCS file: /cvs/gnome/glib/gnode.c,v
retrieving revision 1.8
diff -u -r1.8 gnode.c
--- gnode.c	1998/12/16 05:38:25	1.8
+++ gnode.c	1999/01/19 15:32:01
@@ -108,7 +108,7 @@
   if (!current_allocator)
     {
        GAllocator *allocator = g_allocator_new ("GLib default GNode allocator",
-						1024);
+						63);
        g_node_validate_allocator (allocator);
        allocator->last = NULL;
        current_allocator = allocator;
Index: gslist.c
===================================================================
RCS file: /cvs/gnome/glib/gslist.c,v
retrieving revision 1.7
diff -u -r1.7 gslist.c
--- gslist.c	1998/12/16 05:38:26	1.7
+++ gslist.c	1999/01/19 15:32:01
@@ -102,7 +102,7 @@
   if (!current_allocator)
     {
       GAllocator *allocator = g_allocator_new ("GLib default GSList allocator",
-					       1024);
+					       100);
       g_slist_validate_allocator (allocator);
       allocator->last = NULL;
       current_allocator = allocator; 











[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]