Performance notes on background flickering

I have done some measurements on what causes the background flickering
when you move and resize windows. 

The measurements were done with the "sysprof" profiler, which I
recently checked into CVS. The profiler uses a kernel module that, on
every timer interrupt, generates a stack trace for the current process
and writes it to a file in /proc. A user-space program then interprets
this data and displays it in a GUI. This gives CPU usage data on a
system-wide basis. The module only works with 2.4 kernels (feel free to
port it and send me patches).
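
To make the percentages below easier to read, here is a minimal sketch
of how timer-driven stack samples can be turned into inclusive
percentages. It is not sysprof's code: the sample format (a tuple of
function names, outermost caller first) and the name
inclusive_percentages are assumptions made for illustration.

```python
from collections import Counter


def inclusive_percentages(samples):
    """Return {function: percent of samples whose stack contains it}."""
    total = len(samples)
    if total == 0:
        return {}
    counts = Counter()
    for stack in samples:
        # Count each function once per sample, even if it recurses.
        for func in set(stack):
            counts[func] += 1
    return {func: 100.0 * n / total for func, n in counts.items()}


# Hypothetical samples, one per timer interrupt, outermost caller first.
samples = [
    ("main", "meta_frames_expose_event", "meta_draw_op_draw_with_env"),
    ("main", "meta_frames_expose_event", "meta_draw_op_draw_with_env"),
    ("main", "meta_frames_expose_event"),
    ("main", "WaitForSomething"),
]
report = inclusive_percentages(samples)
```

A function's figure includes the time spent in everything it calls,
which is why nested entries in a profile add up to less than their
parent.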

The measurements were done on a Thinkpad laptop with an i810 chipset
using 8 of 256 MB as video memory. Before April 22, GNOME Terminal
would redraw itself every time it got a ConfigureNotify event. The
measurements were done with a newer version that only does that when it
has a transparent background. Also, the version of GTK+ used has a patch
to not draw behind mapped child windows (available in bug ). All text is
drawn with subpixel smoothing.

The measurements were done by dragging a GNOME Terminal around on top of
several other GNOME Terminals with lots of text in them (the output of
ls -l).

Highlights from the profile:

                - X                                 48%
                     RENDER                     15%
                     ConfigureWindow *          10%
                     WaitForSomething            2%
                     CopyArea                    2%
                     Noise ...

                        * includes painting background

                - Metacity (bluecurve)              35%
                     meta_frames_expose_event   28%

The break-down inside metacity looks like this:

    meta_draw_op_draw_with_env                          24.28%
        draw_op_as_pixbuf()                             11.72%
            meta_gradient_spec_render                    8.62%
                meta_gradient_create_multi_vertical      7.21%
            scale_and_alpha_pixbuf                       2.68%
        get_gc_for_primitive                             3.20%
        parse_x_position_unchecked                       2.08%
        gdk_draw_line                                    1.54%
        g_object_last_unref                              1.31%
        parse_y_position_unchecked                       1.16%

A lot of the drawing on my laptop is not accelerated. On my desktop
computer with the open source Nvidia driver, X generally takes a lot
less CPU. RENDER is still around 12 to 15%, so it stands out a bit more
in the X profile. Metacity generally uses more time than X on that
machine.

Note that GTK+ simply doesn't show up in this profile. Other profiles,
where Nautilus windows are dragged over a GEdit instance, show more time
spent in GTK+, but metacity is still generally the big one, depending on
theme.

The biggest inside-GTK+ offender in the GEdit case is Pango, but nothing
truly dramatic (6-7%). For applications with many widgets, like
Gnumeric, the GObject signal emission overhead is at 10-15% of the total
CPU time.

Improving Metacity is likely to help a lot, for the reasons outlined in
this mail:

Basically the repaint lag is going to be proportional to

        P_a / (M - P_f)

where P_a is the time it takes to paint an application, M is the time
between two mouse events, and P_f is the time it takes to paint the
frame. Because P_f sits in the denominator, improving metacity (i.e.
reducing P_f) means a non-linear improvement in lagging: the closer P_f
is to M, the more each saved millisecond helps.
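
A small worked example of the formula, as a sketch. The timings are
made up for illustration (they are not measurements from the profile),
and relative_lag is a hypothetical helper name.

```python
def relative_lag(p_a, m, p_f):
    """Evaluate P_a / (M - P_f); all arguments in the same time unit."""
    if p_f >= m:
        # The frame alone uses the whole interval between mouse events,
        # so the application never gets to catch up.
        raise ValueError("P_f must be smaller than M")
    return p_a / (m - p_f)


# Assumed numbers: M = 20 ms between mouse events, P_a = 10 ms.
slow_frame = relative_lag(10, 20, 15)   # frame paint takes 15 ms
fast_frame = relative_lag(10, 20, 10)   # frame paint takes 10 ms
```

With these assumed numbers, cutting the frame paint time by a third
(15 ms to 10 ms) halves the lag factor (2.0 to 1.0).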

The mail talks about this in the context of resizing, but it is valid
for opaque moving too. (And the resizing part of the problem is
completely fixed by the update counter approach that Havoc suggested. I
recently sent GTK+ and metacity patches to wm-spec-list and

Of course, COMPOSITE will drown the problem in memory, but that is still
a bit into the future.

Improving metacity:

A metacity theme is basically a tree of drawing operations, and drawing
a frame is done by traversing this tree. There are simple improvements
to this traversal that could be made:

        - It generates a GC for each node in the tree. Simple caching
          should be beneficial here.

        - It parses expressions over and over. Converting to expression
          trees should be beneficial.

        - Caching schemes for pixbufs may be possible.

A different approach is to "compile" the tree into pixmaps. I think it
is possible for most themes to be compiled into a small number
of pixmaps and make the frame paint be just blitting those to the
screen. I think this would essentially make metacity disappear from the
profile, but it is a relatively big and non-trivial project.
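
A rough sketch of the pixmap idea, under heavy assumptions: the class
FramePixmapCache, the piece names and the fake_render stand-in are all
hypothetical, and a real version would also have to key on things like
focus state and title text, and throw the cache away on theme changes.

```python
class FramePixmapCache:
    """Render each frame piece once per size, then hand back the result."""

    def __init__(self, render):
        self._render = render   # expensive: walks the draw-op tree
        self._pixmaps = {}      # (piece, width, height) -> rendered pixmap
        self.renders = 0        # how often the expensive path was taken

    def get(self, piece, width, height):
        key = (piece, width, height)
        if key not in self._pixmaps:
            self._pixmaps[key] = self._render(piece, width, height)
            self.renders += 1
        return self._pixmaps[key]


def fake_render(piece, width, height):
    # Stand-in for traversing the theme tree; returns a dummy "pixmap".
    return bytes(width * height)


cache = FramePixmapCache(fake_render)
for _ in range(100):            # 100 expose events during an opaque move
    pixmap = cache.get("titlebar", 400, 20)
```

During a move the frame size does not change, so every expose after the
first is served from the cache and only the blit remains.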


