[Evolution-hackers] GStringChunk - not all it's cracked up to be.



With these new patches coming in pushing the idea of moving to a
GStringChunk here and there, at first it all sounds well and good, but
then I looked at the source code to GStringChunk because I had a feeling
of how it was implemented and based on that feling, had some bad
feelings about how it handled situations that would be extremely likely
to cause a more bloated usage of memory than aught be the case.

Sure enough, GStringChunk does indeed handle these situations in a most
nightmarish fashion... it will waste huge chunks of unused memory in any
use-case that is not "ideal" (e.g. ANY case where string lengths are
inconsistant).


In order for GStringChunk usage to actually make sense, you need to
optimise the "size" argument to g_string_chunk_new() which is not an
easy task to do for strings of which you have no control of (unless of
course, you simply use the insert_const method... in which case, why
even use a GStringChunk at all? simpler to simply use a GHashTable,
because that's all g_string_insert_const is really doing anyway).


So how does GStringChunk work? Well, it's basically a linked list of
memory chunks each "size" bytes long (or longer if the string in that
particular chunk is longer than "size" bytes).

When inserting a new string into the GStringChunk, if the string is
longer than the remaining number of bytes in the head node of the chunk
list, then a new node is prepended and filled with the inserted string.
Any remaining space is then calculated and saved for comparison to the
strlen of the next string to be inserted. Repeat.


So say we choose to create a new string chunk of 1024 bytes. Now we
insert a string of 624 bytes. That fits into the head node of the chunk
because 624 < 1024. That leaves 400 bytes remaining in the chunk of the
head node.

The second string we insert is, say, 724 bytes long. Oops, 724 > 400, so
we have to create a new node w/ a new chunk of size 1024. This new node
now becomes the head node, and the GStringChunk data structure is
updated to remember that it has 300 bytes left in the head node (but it
doesn't know how many bytes are left in the original node anymore... so
oops, 400 bytes wasted - and counting!!).

The third string we insert will be 400 bytes for optimum bloatage of our
GStringChunk example. 400 > 300, so, again, we must create a new chunk
of 1024 bytes to insert our string into. GStringChunk records that it
has 624 bytes available in the head node.

Net result?

3 strings inserted

total required memory:  1748 bytes
total memory allocated: 3072 bytes + GHashTable overhead + GSList
overhead + GStringChunk overhead
total memory wasted:    1324+ bytes


-- 
Jeffrey Stedfast
Desktop Hacker - Novell, Inc.
fejj novell com  - www.novell.com




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]