Re: A tale of waiting

From: Kristian Rietveld <kris gtk org>
To: Benjamin Otte <otte gnome org>
Cc: gtk-devel-list gnome org
Subject: Re: A tale of waiting
Date: Thu, 25 Jun 2009 12:12:05 +0200
Hi,

A brief reply on the tree view bits.  I have to admit that I've not
looked at the code.

On Tue, Jun 23, 2009 at 10:16 PM, Benjamin Otte<otte gnome org> wrote:
> The first thing I did was set a goal. I decided to target the ls
> command, because I know it to be fast and because there's no reason my
> terminal should be faster than my file chooser for listing files -
> even if Behdad and the Chris'es maintain it. So i ran time ls -la >
> /dev/null, it took about 0.1s. This means that everything that takes
> more than 2% total CPU time needs to go away. I should note that I do
> not care about memory usage a lot, only about execution speed, as a
> file chooser should go away quickly and give you back your memory.

As a side comment, I am not really sure how fair it is to compare a
full blown GUI to a command line utility with its output redirected to
/dev/null.  I do agree that a load time of 4 to 6 seconds is too long.

>  * Sorting the list
> The file chooser uses a GtkTreeModelSort to sort. A huge amount of CPU
> was spent trying to sort. And when looking at the ls time (which does
> sort its output), this shouldn't happen.

I would guess this is due to the files being added to the model one by
one, which means that they are sorted one by one by the
GtkTreeModelSort.  This uses insertion sort on an array, which is
slow.

> Next I implemented sorting. Because I use an array, I can use qsort(),
> which is fast. I had thought about switching to GSequence to get ever
> closer to GtkListStore, but I did not do that, and one of the reasons
> is that GSequence uses insertion sort, which is quite slow. Sorting
> 40.000 elements takes over a second in a small test I did.
> After I had implemented sorting, the file chooser was abysmally slow
> again - taking around 25s. This was for two reasons:
>  * The sort function was called way too often. So I added the option
> to freeze and thaw the tree view. Now the sort function was only
> called once or twice.

Where is this sort function located?  Judging from the following comment ...

>  * gtk_tree_model_get() being called in the iter comparison functions
> took almost all the CPU. 25% of the time was due to needlessly copying
> GValues, but 75% of the time was spent in lookup of the GtkTreeModel
> interface. As the comparison function is in a critical spot for
> performance, this was inacceptable. Also, using
> gtk_tree_model_get_value() directly didn't get rid of any of these
> problems. So I added an access function to GtkFileSystemModel and the
> problem was gone.

... I guess the sort function is located in the file chooser dialog
(as opposed to being inside the model) and thus has to use the
gtk_tree_model_get() calls.  Since the sort function is called often
and the main part of the sort function is to fetch the values needed
for comparison from the model, it is not weird that there is a large
overhead introduced by GObject's implementation of interfaces.  I
don't know how many different sort functions are used in the file
chooser.  If it is only one, you could make GtkFileSystemModel
implement the GtkTreeSortable interface (I get the feeling you've
already done that) and move the sort function to be internal to
GtkFileSystemModel (so it can access all values directly).  Make it
the default sort function that can be overridden using the
GtkTreeSortable interface.  This is in fact a circumvention of the
GtkTreeSortable and GtkTreeModel interfaces, but it does eliminate the
GtkTreeModel overhead you are seeing.  (Also, if the default sort
function is overridden, that new function will suffer from the model
overhead again.)

Adding freeze and thaw to tree view is not an option.  It has been
designed in the past to live without freeze and thaw and the addition
of these functions has been repeatedly refused.  It would be good to
know why the sort function is called so often -- is this because the
files come in one by one?

>  * evaluate fixed height mode for the tree view, it might make things
> even faster without losing features.

Yes, this will remove the full pass through the model.  More on this below.

> There are some remaining performance issues, that I didn't touch on
> yet, but that are quite important for really catching up to ls.
>  * 50+% of the remaining CPU is spent in GtkTreeView's validate_row().
> I have no clue what that function even does. Is there a way to get rid
> of it?

validate_row() is the main part of the validation functionality.  What
it basically does is, given a row, it will set the data (retrieved
from the model) on all cell renderers and then call the get_size()
method on each renderer.  When done, the tree view knows the minimal
size this row needs to display itself.  The proper row size is then
determined and set.  By default, tree view does a full pass through
the model to do this for each row.  At the end, the sizes of the
scroll bars will be fully correct.

Fixed height mode bypasses this by only measuring the first row, and
then set the same size for each row.  Rows cannot have different
heights in this mode.

A good while ago I have been experimenting with removing the fixed
height mode and also removing the full pass done through the model by
default.  Instead, tree view will only validate (i.e. measure) the
rows that are currently visible.  When you scroll, other rows become
visible and are subsequently validated.  Based on the heights of all
rows validated up to a certain point, it will make a guess for the
entire size of the tree view and update the scrollbars. This guess is
"good enough" for almost all cases, especially when the row count
grows bigger.  The guess becomes more accurate as more rows are seen.
Seemed to work very well and resulted in lightning fast large (4
million+ rows) variable-height tree views.

I've more things (mostly refactorings) sitting around that I really
want to get in a shape to be committed to trunk.  What has been
holding me off until now is a total lack of time (this will most
probably change soon, yay) and the lack of an automated test suite for
tree view.  If somebody has ideas on how to automatically test
layouting and drawing algorithms for correctness, that would be
appreciated ;).

> And now the obvious question: Should I just merge it to master when
> I'm done with the regressions?

I think this has been answered already.


regards,

-kris.
Follow-Ups:
- Re: A tale of waiting
  - From: Benjamin Otte
- Re: A tale of waiting
  - From: Jernej =?utf-8?Q?Simon=C4=8Di=C4=8D?=
References:
- A tale of waiting
  - From: Benjamin Otte
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]