Re: Some performance notes



Owen Taylor <otaylor redhat com> writes:

> In general, the signal emissions aren't retrieving information
> so I don't think caching is relevant. An exception to this is
> size-request. There might be some noticeable improvements
> for resizing if we:
> 
>  - Add a private flag for the widget "has requisition"
>  - Clear the flag initially and on queue-resize
>  - Set the flag after ::size-request.
>  - Use widget->requisition instead of emitting ::size-request
>    when the flag is set.

The measurements I did indicated that the size negotiations didn't
really take up much time.  Redrawing was the real time-consumer.  But
if redrawing is reduced/sped up, this may be a good idea.  

As far as I can tell, any widget that used its requisition field for
something other than its requisition would be badly broken.

> Unfortunately, widgets have been known to store things other than
> their actual allocation in widget->allocation, which poses a problem
> for this.

I personally wouldn't mind if such widgets were considered broken and
a note about it were made in Changes-2.0.txt, but I don't know if that
would be acceptable for users.

> > To experiment with interactive performance, I made panes resize
> > opaquely and resizes take effect immediately. 
> > Then I put a VPane in the toplevel window in testgtk and ran the
> > whole thing under a profiler and tried to extract from the
> > call-graph where the time was spent /when responding to events/.
> > 
> > The result was that by far the most time was spent redrawing (
> 
> I'm curious what machine you were doing this testing on. 
> (CPU, video card, and amount of ram on the video card)
> I never got any noticeable performance problems with your opaque
> paned resize patch on a :

I wouldn't call it a performance problem - it is not like gtk was
unusable or very slow, it was just not as snappy as it could be.

>  celeron 400mhz, 32meg matrox g400, 1280x1024 16bpp
> 
> So I assume your setup must be considerably slower.

It is a Pentium Pro 200 MHz with a 4MB Matrox Millennium at
1600x1200 16 

> > and a non-negligible amount on gtk_widget_style_get()).  
> 
> Yes, I saw this too. I think g_param_spec_pool_lookup() may
> need a good kick in the rear. 

Or some sort of caching? The same styles on the same widget are looked
up over and over during interaction.
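
To make the caching idea concrete, here is a rough sketch of a small
per-widget cache in front of an expensive lookup. Nothing here is GTK+
API: StyleCache, style_cache_get() and slow_style_lookup() are
hypothetical names, and the "slow" lookup is a dummy that stands in for
the param spec pool lookup. The cache would have to be cleared whenever
the widget's style changes.

```c
#include <assert.h>
#include <string.h>

#define STYLE_CACHE_SLOTS 8

typedef struct {
    const char *name;   /* property name; NULL means the slot is empty */
    int value;
} StyleCacheEntry;

typedef struct {
    StyleCacheEntry slots[STYLE_CACHE_SLOTS];
    int next;           /* next slot to overwrite (round robin) */
    int slow_lookups;   /* how often the slow path ran, for illustration */
} StyleCache;

/* Hypothetical stand-in for the expensive lookup; returns a dummy value. */
static int slow_style_lookup(const char *name)
{
    return (int)strlen(name);
}

/* Drop all entries; call this when the widget's style changes. */
static void style_cache_clear(StyleCache *c)
{
    for (int i = 0; i < STYLE_CACHE_SLOTS; i++)
        c->slots[i].name = NULL;
    c->next = 0;
}

static void style_cache_init(StyleCache *c)
{
    style_cache_clear(c);
    c->slow_lookups = 0;
}

/* Return the cached value for name, doing the slow lookup on a miss.
 * The cache stores the pointer, so name must outlive the entry
 * (a string literal, for example). */
static int style_cache_get(StyleCache *c, const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < STYLE_CACHE_SLOTS; i++)
        if (c->slots[i].name != NULL && strcmp(c->slots[i].name, name) == 0)
            return c->slots[i].value;

    c->slow_lookups++;
    int value = slow_style_lookup(name);
    c->slots[c->next].name = name;
    c->slots[c->next].value = value;
    c->next = (c->next + 1) % STYLE_CACHE_SLOTS;
    return value;
}
```

A linear scan over a handful of slots is probably enough here, since a
widget only asks for a few distinct properties while it redraws.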

> > This is good, because drawing means something entertaining is going on
> > on the screen, which is another way of saying that it feels snappy.
> > It also means that if resizes are handled this way, speedups to the
> > drawing code immediately translate into snappiness.  In fact, it
> > doesn't even have to be speed-ups - just moving time-consuming stuff
> > out of the interaction loop helps a lot.  
> 
> The current idle approach for drawing and resizing should work almost
> exactly as well as compression as long as the toolkit can handle
> events as fast as they come in. As long as you push expensive handling
> to the idles, there should be no reason GTK+ can't handle events as
> fast as they come in.

On my (slow) machine, I see a noticeable delay between opaque resizing
and the actual effect. This is especially noticeable when the widget
tree is large. This effect is reduced with -dumbSched, but it doesn't
completely go away.

>   Queue configure event 1             q
>   Handle event 2                      x
>   Handle event 3                      y
>   Queue configure event 4             q
>   Handle queued configures            r
> 
>   Compress configure event 1 and 4 and handle it
>   Handle event 2
>   Handle event 3
>    
> Aren't significantly different performance-wise as long as
> the event handling is much faster than handling the 
> configures, which I believe is the case for GTK+, or 
> at least should be the case.

If the queueing of a resize takes longer than the interval between two
configure events, then gtk will never enter the idle loop, and the
resize will never take effect.

After I tried starting the server with -dumbSched, I must admit that
it doesn't look like compression is going to do much better than
queueing. I still want to do some experiments, though:
        
        - write a program that produces numbers on
                - how long is the interval between configure events
                - how long does it take to queue a resize
                - how long does it take to reallocate a large widget
                  tree
                - how long does it take to redraw a large widget tree
          These numbers should be interesting on different setups

        - make queue_resize() and size_allocate() faster and reduce
          the amount of re-drawing on an otherwise unmodified gtk

        - see what effect configure compression has
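
The bookkeeping for the first experiment could be as small as the sketch
below. It is only an assumption about how such a program might collect
its numbers: IntervalStats and its functions are hypothetical, and
now_us() uses the C11 timespec_get() wall clock, where a real measurement
would probably prefer a monotonic clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Current time in microseconds (C11 wall clock, not monotonic). */
static int64_t now_us(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

typedef struct {
    int64_t last_us;    /* timestamp of the previous sample, -1 if none */
    int64_t total_us;   /* sum of all intervals seen so far */
    int64_t max_us;     /* longest interval seen so far */
    size_t count;       /* number of intervals (samples minus one) */
} IntervalStats;

static void interval_stats_init(IntervalStats *s)
{
    s->last_us = -1;
    s->total_us = 0;
    s->max_us = 0;
    s->count = 0;
}

/* Record that an event (a configure, say) was seen at time t_us. */
static void interval_stats_record(IntervalStats *s, int64_t t_us)
{
    if (s->last_us >= 0) {
        int64_t delta = t_us - s->last_us;
        s->total_us += delta;
        if (delta > s->max_us)
            s->max_us = delta;
        s->count++;
    }
    s->last_us = t_us;
}

/* Mean interval in microseconds; needs at least two recorded samples. */
static int64_t interval_stats_mean(const IntervalStats *s)
{
    assert(s->count > 0);
    return s->total_us / (int64_t)s->count;
}
```

Calling interval_stats_record(&stats, now_us()) from the configure
handler would give the interval between configure events. The other
three numbers are durations, so they would be measured by taking
now_us() before and after the call in question.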

> If you, with your opaque paned resize patch, can get to the 
> point where it never catches up with the mouse and
> sticks at the old location, I'd be very interested to
> know what GTK+ is doing while it is sticking.

I can't get to that point, but I *do* get a noticeable lagging of the
handle behind the cursor.  I can even provoke that lagging (albeit a
much shorter one) on a PIII/800MHz with XFree86 3.3.6 (a vanilla
RedHat 6.2).

> Advantages queueing has over compression are:
> 
>  - safe, better compression - you frequently get the situation
>    where you can't safely compress event type A across event
>    type B (you can't compress exposes across configures safely), but
>    queueing allows you to compress as much as makes sense.
> 
>  - less complicated code
> 
>  - reading forward in the event queue in X is really inefficient.

What I have in mind is really simple: instead of 
        
        - reading a configure
        - queueing a resize
        
I want to

        - read as many configures as possible
        - queue the last configure

This should not be too complicated.  Of course, it only works if
configures tend to arrive without other events intervening.
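
Roughly, the loop I mean looks like the sketch below. It works on a plain
array of events instead of the real X queue, and the Event, ResizeLog and
function names are made up for the example. Only configures that follow
each other directly are compressed, so nothing is ever compressed across
another event type.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EV_CONFIGURE, EV_EXPOSE, EV_MOTION } EventType;

typedef struct {
    EventType type;
    int width, height;   /* only meaningful for EV_CONFIGURE */
} Event;

typedef struct {
    size_t resizes_queued;        /* how many resizes were queued */
    int last_width, last_height;  /* size of the most recent one */
} ResizeLog;

/* events[pos] must be a configure.  Skip over the configures that follow
 * it directly and return the index of the last one in that run. */
static size_t last_configure_in_run(const Event *events, size_t n, size_t pos)
{
    assert(pos < n && events[pos].type == EV_CONFIGURE);
    while (pos + 1 < n && events[pos + 1].type == EV_CONFIGURE)
        pos++;
    return pos;
}

/* Walk the queue, queueing one resize per run of configures. */
static void process_events(const Event *events, size_t n, ResizeLog *log)
{
    for (size_t i = 0; i < n; i++) {
        if (events[i].type == EV_CONFIGURE) {
            i = last_configure_in_run(events, n, i);
            log->resizes_queued++;
            log->last_width = events[i].width;
            log->last_height = events[i].height;
        }
        /* Any other event type would be handled normally here. */
    }
}
```

For the sequence configure, configure, configure, expose, configure this
queues two resizes instead of four.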

> > (At some point I introduced a low priority idle loop to take care of
> > calls to g_object_unref - this really made a difference).
> 
> I assume you mean by this "to take care of freeing objects"  -
> if g_object_unref() is slower than adding something to a queue,
> we have a real big problem.

Yes, as Tim said, what I saw was most likely that g_object_unref()
destroyed the object.

> Even queued destruction sounds like a bit of a dubious idea - I keep
> resizing my window and the memory usage keeps going up? 

I am not advocating that approach - if a lot of creation/destruction
can be avoided, that is better.  I just did the idle thing to get time
consuming stuff out of the interaction loop.

In general, I think it is important to keep in mind that interactions
tend to be short bursts of activity where ideally everything that
doesn't draw something on the screen should be delayed.  This
maximizes the amount of entertainment per second.

If the user is weird and keeps resizing for extended periods of time,
the queued up stuff should start being run so that things don't get
delayed indefinitely.
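
One way to picture this is a deferred-work queue with a cap, as in the
sketch below. This is not what I actually implemented; DeferQueue, the
threshold and the function names are assumptions made for the example.
Work is put off until the idle handler runs, but once too much has piled
up it is run right away.

```c
#include <assert.h>
#include <stddef.h>

#define DEFER_CAPACITY 64

typedef void (*DeferredFunc)(void *data);

typedef struct {
    DeferredFunc func[DEFER_CAPACITY];
    void *data[DEFER_CAPACITY];
    size_t count;            /* number of pending items */
    size_t flush_threshold;  /* run everything once this many are pending */
} DeferQueue;

static void defer_init(DeferQueue *q, size_t flush_threshold)
{
    assert(flush_threshold > 0 && flush_threshold <= DEFER_CAPACITY);
    q->count = 0;
    q->flush_threshold = flush_threshold;
}

/* Run all pending work; this is what the idle handler would call. */
static void defer_run_all(DeferQueue *q)
{
    for (size_t i = 0; i < q->count; i++)
        q->func[i](q->data[i]);
    q->count = 0;
}

/* Put work off until later.  If the interaction goes on long enough for
 * the queue to reach the threshold, run it now so that nothing is
 * delayed indefinitely.  count stays below the threshold between calls,
 * so the arrays cannot overflow. */
static void defer_add(DeferQueue *q, DeferredFunc func, void *data)
{
    q->func[q->count] = func;
    q->data[q->count] = data;
    q->count++;
    if (q->count >= q->flush_threshold)
        defer_run_all(q);
}

/* Example work item: bump a counter. */
static void count_call(void *data)
{
    ++*(int *)data;
}
```

A time limit on the oldest pending item would do the same job as the
count threshold; the count is just the simpler thing to show.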

> (GTK+-2.0 does some neat tricks to make sure that it never has
> to wait for responses from the server for expose events it 
> knows it is generating, and that's something I would be 
> very unhappy to give up.)

Could you point me to the code that does this?

> Note also that if someone calls queue_resize on a widget 
> explicitly, we guarantee that the widget will be redrawn
> completely, and we can't change that without breaking 
> compatibility. 

But hopefully there is no guarantee that an implicit resize triggers a
redraw of the entire widget (with children)?

> > That was my observation too.  Caching the GCs instead of creating and
> > destroying them made a significant difference.  Also caching of the
> > backing store pixmaps made a difference.
> 
> I don't think caching backing store pixmaps is a good idea - 
> that consumes a lot of server resources we don't need permanently.

If we had an idle handler to release the pixmaps, gtk would only
consume these resources while it was doing lots of drawing.  There
could be a (low) limit on the amount of pixmap-memory it was allowed
to consume at any point in time.
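
As a sketch of that policy (bookkeeping only, with no real X pixmaps
involved): PixmapCache and its functions are hypothetical, and the cache
tracks only sizes, where a real one would also hold the pixmap handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PIXMAP_CACHE_SLOTS 16

typedef struct {
    int width, height, bytes_per_pixel;
    bool in_use;
} CachedPixmap;

typedef struct {
    CachedPixmap slots[PIXMAP_CACHE_SLOTS];
    size_t bytes_used;   /* pixmap memory currently held by the cache */
    size_t byte_limit;   /* the (low) limit on that memory */
} PixmapCache;

static size_t pixmap_bytes(int width, int height, int bytes_per_pixel)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return (size_t)width * (size_t)height * (size_t)bytes_per_pixel;
}

/* Idle handler: give everything back once drawing has calmed down. */
static void pixmap_cache_release_all(PixmapCache *c)
{
    for (int i = 0; i < PIXMAP_CACHE_SLOTS; i++)
        c->slots[i].in_use = false;
    c->bytes_used = 0;
}

static void pixmap_cache_init(PixmapCache *c, size_t byte_limit)
{
    c->byte_limit = byte_limit;
    pixmap_cache_release_all(c);
}

/* Try to keep a pixmap for reuse.  Returns false if that would exceed
 * the limit or no slot is free; the caller then frees it right away. */
static bool pixmap_cache_store(PixmapCache *c, int w, int h, int bpp)
{
    size_t size = pixmap_bytes(w, h, bpp);
    if (c->bytes_used + size > c->byte_limit)
        return false;
    for (int i = 0; i < PIXMAP_CACHE_SLOTS; i++) {
        if (!c->slots[i].in_use) {
            c->slots[i] = (CachedPixmap){ w, h, bpp, true };
            c->bytes_used += size;
            return true;
        }
    }
    return false;
}

/* Take a cached pixmap of exactly this size, if there is one. */
static bool pixmap_cache_take(PixmapCache *c, int w, int h, int bpp)
{
    for (int i = 0; i < PIXMAP_CACHE_SLOTS; i++) {
        CachedPixmap *p = &c->slots[i];
        if (p->in_use && p->width == w && p->height == h &&
            p->bytes_per_pixel == bpp) {
            p->in_use = false;
            c->bytes_used -= pixmap_bytes(w, h, bpp);
            return true;
        }
    }
    return false;
}
```

The limit bounds what the cache holds at any moment, and
pixmap_cache_release_all() is what a low priority idle handler would
call, so nothing stays allocated once the burst of drawing is over.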

> > Opaque window resizing sucks currently because resizes are queued up
> > and not handled until the idle loop - essentially, Gtk is throwing
> > most of the ConfigureNotify events away instead of handling them.  My
> > guess would be that configure compression plus immediate resizes would
> > help, but I haven't tried it yet.
> 
> I couldn't see this effect at all on my machine, I did see (with my
> slower, earlier tests with debugging enabled) that I was triggering a
> pathology in the XFree86 scheduler on my machine. Try starting your
> server with -dumbSched and this behavior may go away.

Starting the server with -dumbSched did make most of the lagging go
away. Is this scheduler behaviour considered a bug in XFree86?  If
not, some sort of workaround is needed.

> Also, --disable-debug makes a big difference. Since we'll be
> shipping with debugging turned off unless we can fix the overhead,
> it may make sense to do performance testing with --disable-debug.

Everything I have done so far has been with --disable-debug.

Søren



