Text Widget issues (forked from: Possible Glib BTree substitute)



>  - The tree nodes are not really a memory problem; they contain 
>    9 32-bit values, and a 50,000-line buffer results in about 
>    7,700 nodes; 36 bytes per node times 8000 is under 300K. 
>    Of the 9 elements in the node, all 9 (I think) are also 
>    required in a skip list node. So you can only improve on 
>    memory by reducing the number of nodes, and even then the 
>    nodes are a small percentage of total memory used once you 
>    count lines, wrapped display lines, character data, and other 
>    stuff that either data structure will have.

	However, every single Line (FrooTkxtLine) also has a BTree node
associated with it (or rather, every node with level zero has a line
associated with it).   I believe you have only counted "parent" nodes
above, and have discounted the nodes that have a level of zero.  So, in
addition to your 300K, you have another 36 bytes of memory for every
single line, in addition to the FrooTkxtLine struct for each line.

	I think it's fair to say that many (if not most) lines of C code
(as an example text file) have fewer than 36 characters, so using a
structure other than something like a gapped text buffer is very
expensive.  I used to think Nedit was expensive because it uses a second,
parallel "style buffer" that basically doubles the amount of RAM needed to
hold the text.  But a 100% increase is nuthin' :).

	I'll have the same problem with my skip list--except that the
nodes at level zero won't have "num_lines".  Also, with the skip list I
won't need "num_children", which amounts to another 4 bytes per line
savings.  I should be able to trim off quite a bit of fat once I switch
over to a singly-linked skip list--but 4 bytes of that will go into the
generic "data" pointer that I am using to keep my data structure generic,
rather than single-purpose.

	I plan on also adding another 4 bytes for "gint pixel_height", the
calculated total height of all the DLines for a given line.  Then, I'll
propagate that sum up the tree, just like num_chars or num_lines, and
I'll always know my total height at the cost of one int per line.

>  - The only user-visible speed problem with the BTree, with a 
>    50,000-line buffer, is re-wrapping a lot of lines. gprof
>    reveals most time spent in gdk_char_width(), with the second 
>    largest amount of time spent retrieving a list of tags 
>    applied to a given point in the buffer.

	How often does this get calculated?  Do you store the calculated
result anywhere?

>    Pango will increase the gdk_char_width() time. This is really 
>    the huge bottleneck.

	The way I picture it now is to do the total_height calculation
once, when the file is loaded (or after syntax highlighting is done) and
then just maintain that number, like num_chars or anything else.

>    So, this will be faster if the skip list is shallower than the
>    tree.

	The height of the tree is determined by the probability you're
using for the skip list.  I'm currently using a probability of .5,
hardcoded in.  I'd like to make that user-settable but I don't know where
I'd store the value.  head->level[0].data is an option, I suppose...


--Derek




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]