Re: Doubts about GPeriodic



On Fri, 2010-10-22 at 19:30 -0400, Havoc Pennington wrote:
> Hi,
> 
> On Fri, Oct 22, 2010 at 4:48 PM, Owen Taylor <otaylor redhat com> wrote:
> > I think we're largely agreeing on the big picture here - that priorities
> > don't work so there has to be arbitration between painting and certain
> > types of processing.
> 
> Right, good. The rest is really just details - there are various ways
> it could work.
> 
> As I wrote this email I realized I'm not 100% clear how you propose
> the 50/50 would work, so maybe it's something to spell out more
> explicitly. There's no way to know how long painting will take right,
> so it's a rule for the "other stuff" half? Do you just mean an
> alternative way to compute the max time on non-painting tasks (half of
> frame length, instead of 5ms or "until frame-complete comes back")?

I hadn't really worked it out to the point of an algorithm, but let me
see if I can take a stab at that.

My starting point is that:

 - We should not start painting the next frame until we are notified
   the last frame is complete.

 - Once we are notified the last frame is complete, if the 
   "other stuff" queue is empty, we should start painting the next
   frame immediately - we shouldn't hang around waiting just in case
   something shows up.
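
A minimal sketch of those two rules as a single decision function. The
function name, its parameters and the idea of passing the "other stuff"
budget in as a number are all hypothetical, not part of any GLib or GTK+
API; how to pick the budget is exactly the open question.

```python
def should_start_painting(frame_complete, other_queue_len,
                          other_elapsed_ms, other_budget_ms):
    """Decide whether to begin painting the next frame (illustrative only).

    other_elapsed_ms is the time already spent on "other stuff" since the
    last frame completed; other_budget_ms is the limit under contention.
    """
    if not frame_complete:
        # Rule 1: never start the next frame before the last one is done.
        return False
    if other_queue_len == 0:
        # Rule 2: nothing else is waiting, so paint immediately.
        return True
    # Contention: keep processing "other stuff" until the budget is spent.
    return other_elapsed_ms >= other_budget_ms
```

With an 8ms budget, a call with a completed frame, three queued items and
2ms already spent would say "not yet"; the same call with 8ms spent would
say "paint now".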

So the question is how long after frame completion we should keep on
processing "other stuff" before we start painting the next frame. The
target here is the 50% rule - that we want to roughly balance the time
to paint the frame with the time that we spend processing everything
else before painting it.

The simplest technique we could take is to say that when we have
contention, processing "other stuff" is limited to 0.5 / (refresh rate)
seconds (roughly 8ms for the standard 60Hz refresh). This works out
pretty well until the paint time gets "big". Picking a bunch of
arbitrary data points:

 paint time    other time      fps   work fraction
 ==========    ==========      ===   =============
 1ms           15ms             60   94%
 8ms            8ms             60   50%
 10ms          22ms             30   68%
 17ms          15ms             30   47%
 20ms          12ms             30   38%
 24ms           8ms             30   33%
 40ms          10ms             20   20%
 55ms          11ms             15   20%
 90ms          10ms             10   10% 
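
The fixed-limit rule can be played with in a rough steady-state model.
This is a sketch under assumptions that are mine, not stated above: each
frame completes on a 60Hz refresh, "other stuff" then runs for half a
refresh interval, painting follows, and any idle time before the next
refresh also goes to "other stuff". The function name is hypothetical,
and the figures it prints are approximate; they need not match the
hand-picked rows exactly.

```python
import math

REFRESH_HZ = 60.0
VSYNC_MS = 1000.0 / REFRESH_HZ  # about 16.7ms between refreshes


def simulate_fixed_limit(paint_ms):
    """Return (other_ms, fps, work_fraction) for one steady-state frame."""
    budget_ms = 0.5 * VSYNC_MS            # fixed limit: 0.5 / refresh rate
    busy_ms = budget_ms + paint_ms        # budgeted work, then painting
    # The finished frame is shown at the next refresh after busy_ms.
    intervals = math.ceil(busy_ms / VSYNC_MS)
    frame_ms = intervals * VSYNC_MS
    # Everything in the frame period that is not painting is "other stuff".
    other_ms = frame_ms - paint_ms
    return other_ms, 1000.0 / frame_ms, other_ms / frame_ms


for paint in (1, 8, 10, 17, 20, 24, 40, 55, 90):
    other, fps, fraction = simulate_fixed_limit(paint)
    print(f"{paint:3d}ms paint  {other:5.1f}ms other  "
          f"{fps:5.1f}fps  {fraction:4.0%}")
```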

What this does mean, though, is that there is a cliff across different
systems that's even worse than it looks from the table. Take a very
non-extreme example - if I'm testing my app on my laptop, maybe painting
is taking 20ms, and I'm getting a reasonable 30fps. I give it to someone
with a netbook where CPU and GPU are half the speed and painting takes
40ms. The framerate drops only to 20fps, but the time for a background
operation to finish increases by 3.8x: the work fraction falls from 38%
to 20%, and each unit of work takes twice as long (0.38 / 0.20 * 2 =
3.8). The netbook user has half the CPU and we're using only half of
that half to do the background work.

(This type of thing can happen not just because of a slow system, but
because of other conditions - the user has more data, has a bigger
screen, etc. The less predictable the situation, the more we need to
make sure that things degrade gracefully. "GTK+ application running
on a user's system" is a pretty unpredictable situation.)

So, there's some appeal to actually basing it on measured frame times.
Using just the last frame time is not a reliable measure, since frame
painting times (using "painting" to include event processing and
relayout) are very spiky. Something like:

 - Average time over last three frames
 - Minimum time over last three frames
 - Average time over last three frames where only motion events were
   delivered

probably works better. Once you have a historical frame time estimate,
you limit the total "other stuff" time (before and after frame
completion) to that estimate.
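
A sketch of the three candidate estimators, to make the comparison
concrete. The class and method names are hypothetical, and reading the
third option as "the last three frames that saw only motion events" is
my interpretation of the list item.

```python
from collections import deque
from statistics import mean


class FrameTimeHistory:
    """Keep recent frame times and derive a paint-time estimate from them."""

    def __init__(self, window=3):
        self.all_frames = deque(maxlen=window)   # last N frames of any kind
        self.motion_only = deque(maxlen=window)  # last N motion-only frames

    def record(self, frame_ms, motion_only=False):
        self.all_frames.append(frame_ms)
        if motion_only:
            self.motion_only.append(frame_ms)

    def average(self):
        """Average time over the last three frames."""
        return mean(self.all_frames) if self.all_frames else None

    def minimum(self):
        """Minimum time over the last three frames; ignores spikes."""
        return min(self.all_frames) if self.all_frames else None

    def motion_average(self):
        """Average over the last three frames with only motion events."""
        return mean(self.motion_only) if self.motion_only else None


history = FrameTimeHistory()
for frame_ms, only_motion in ((10, True), (40, False), (12, True), (11, True)):
    history.record(frame_ms, motion_only=only_motion)
print(history.average(), history.minimum(), history.motion_average())
```

The 40ms spike (say, a relayout) still dominates the plain average while
it is in the window, which is the spikiness problem; the minimum and the
motion-only average are less affected by it.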

 paint time    other time      fps   work fraction   with fixed limit
 ==========    ==========      ===   =============   ================
 1ms           15ms             60   94%
 8ms            8ms             60   50%
 10ms          22ms             30   68%
 17ms          33ms             20   65%             30fps 47%
 20ms          30ms             20   60%             30fps 38%
 24ms          26ms             20   52%             30fps 33%
 40ms          60ms             10   60%             20fps 20%
 55ms          61ms            8.6   52%             15fps 20%
 90ms          93ms            5.5   51%             10fps 10%
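
The same kind of rough model works for the measured-time rule. The
assumptions here are mine: the estimate equals the actual paint time,
"other stuff" gets at least as much time per frame as painting does, and
the frame period rounds up to a whole number of 60Hz refresh intervals.
The function name is hypothetical. It is one reading of the rule and
need not reproduce every row of the table.

```python
import math

VSYNC_MS = 1000.0 / 60.0  # about 16.7ms between refreshes


def simulate_measured_limit(paint_estimate_ms):
    """Return (other_ms, fps, work_fraction) for one steady-state frame."""
    # "Other stuff" is allowed as much time as the estimated paint time.
    busy_ms = 2.0 * paint_estimate_ms
    # The frame period is the next whole number of refresh intervals.
    intervals = max(1, math.ceil(busy_ms / VSYNC_MS))
    frame_ms = intervals * VSYNC_MS
    # Idle time left before the refresh also goes to "other stuff".
    other_ms = frame_ms - paint_estimate_ms
    return other_ms, 1000.0 / frame_ms, other_ms / frame_ms


for paint in (1, 8, 10, 17, 20, 24, 55, 90):
    other, fps, fraction = simulate_measured_limit(paint)
    print(f"{paint:3d}ms paint  {other:5.1f}ms other  "
          f"{fps:5.1f}fps  {fraction:4.0%}")
```

The point of the model is the shape of the result: the work fraction
never drops much below half, at the cost of a lower frame rate.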

> > But pathological or not, I think it's also common. This is where my
> > suggestion of a 50% rule comes in. It's a compromise. If we're lucky
> > repainting is cheap, we're hitting the full frame rate, and we're also
> > using 75% of the cpu to make progress. But when drawing takes more time,
> > when there is real competition going on, then we don't do worse than
> > halve the frame rate.
> >
> > (This continues to hold in the extreme - if redrawing is *really* slow -
> > if redrawing takes 1s, then certainly we don't want to redraw for 1s, do
> > 5ms of work, redraw for another 1s, and so forth. Better to slow down
> > from 1 fps to 0.5fps than to turn a 1s computation into a 3 minute
> > computation.)
> 
> Let me think about this in terms of litl shell, which is the
> real-world example I'm thinking of, and maybe we can see how
> gnome-shell or some other GL-using apps differ which could be
> instructive.

[..]

> I guess gnome-shell is similar except less "stuff"

Basically yes. I can't see us ever doing video in gnome-shell - we just
have to be able to smoothly composite video someone else is playing.
But I'm not really thinking about gnome-shell here, I'm really thinking
more about a "standard" application .... perhaps Evolution or Rhythmbox.
Whether written in GTK+ or in Clutter or some hybrid.

As compared to a compositing shell, various differences:

 - Not the compositor; if it stutters, it won't make your TV show
   drop frames. 

 - May need to handle very large lists of items. (Very large lists
   of items are basically GUI bugs, but they come up in most
   real-world applications.)

 - Written with a focus on the application operation - first, get it to
   do the right thing. Then start trying to fix performance problems.

> As you say in the followup mail, at some point multiple
> processes/threads exist for a reason.
> 
> Agreed, but in litl shell there's only one thing I think is in-process
> that shouldn't be, which is the photo app, and one thing is out of
> process that shouldn't be (GTK widgets). It's just a complex app
> talking to a lot of other processes. All the main shell really does is
> coordinate processes and paint an assemblage of the other processes. I
> don't know, I would think it's basically the same deal as the "main"
> Chrome process with every tab out of process, or as the main Eclipse
> process where Eclipse has lots of threads, or whatever. The main shell
> doesn't do blocking IO or long computations. It does have loads of IPC
> queues to talk to all the threads and processes. I almost feel like
> the threads and processes are the whole reason we have queue-based
> GSource.
> 
> It almost seems like this is my prototypical case, where there *isn't*
> any computation in the main thread, just lots of queues to dispatch,
> and the case you're worried about most is where there *is* ... ?

I think I'm trying to take a broad view - we want to work well for the
compositor, but we also have to work well for the application putting
100,000 rows into a GtkTreeView because someone opened their
gtk-devel-list folder with messages back to 2000.

> On Radeon hardware with indirect rendering, litl shell paint takes in
> the 7-9ms area.

This is the type of computation that I don't think most application
authors have the luxury of making. Available graphics hardware can have
a factor of 10 performance range. Performance of a single CPU core can
vary by a factor of 2 or 3. And screen size and data size can vary
widely as well.

>  So for 60fps (16.6ms per frame) you have about 5ms per
> frame leftover with a little headroom. On 50fps then you have more
> headroom.

This may indicate a need to have tuning parameters. A knob to turn
between not starving background processes and not dropping frames.
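
One possible shape for such a knob, purely as a sketch: a single
"balance" value giving the share of each frame period that "other stuff"
may claim under contention. The name, the range and the formula are all
hypothetical; nothing like this exists in GLib or GTK+ as far as this
thread goes.

```python
def other_budget_ms(paint_estimate_ms, balance=0.5):
    """Budget for "other stuff" given a paint estimate and a balance knob.

    balance is the target share of time for "other stuff", in [0, 1):
    0.0 favours never dropping frames, 0.5 is the 50% rule, and values
    near 1.0 favour never starving background processing.
    """
    if not 0.0 <= balance < 1.0:
        raise ValueError("balance must be in [0, 1)")
    # other / (other + paint) == balance, solved for other.
    return paint_estimate_ms * balance / (1.0 - balance)
```

With a 20ms paint estimate, a balance of 0.5 gives a 20ms budget, and a
balance of 0.75 gives 60ms.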

[...]

> Ignoring the specifics, takeaways could be:
>  * there's a cliff in the chosen time to spend not-painting where we
> make ourselves miss the vsync
>  * there's a cliff in the theoretical framerate your app can achieve,
> where if you take longer than a frame to paint you must drop frames,
> and if you take less than a frame you can in theory avoid dropping
> frames - as long as you limit non-painting activities "enough"
>  * both cliffs depend on the hardware, its refresh rate, and on how
> long the app takes to paint on the hardware

I think the addition I'd make here is that as you push the edge of the
cliff you get into a starvation situation - saying that not dropping
frames is an absolute hard priority can run you into difficulties just
like most other types of absolute hard priorities.

> I kinda think most apps would prefer to achieve the best framerate
> possible as limited by their painting, possibly slowing down
> non-painting activities a fair amount "within reason." That is it
> kinda makes sense to avoid the cliffs.

I *don't* think applications would accept that filling a treeview takes
twice as long (or 10x as long) so that the scrollbar can be painted
expanding at a smooth 60fps.

> Unless it doesn't, I mean, in your treeview example the expander
> animation is unimportant enough that maybe the user would rather fill
> the treeview up faster.

The expander animation really isn't my concern. It will end soon enough
no matter how many items the treeview has. My concern is the scrollbar -
the scrollbar will continually need repainting until the treeview is
full.

- Owen



