Re: Performance change from X in Fedora Core 4 to Fedora Core 5

From: Owen Taylor <otaylor redhat com>
To: Clemens Eisserer <linuxhippy gmail com>
Cc: gtk-devel-list gnome org, xorg freedesktop org
Subject: Re: Performance change from X in Fedora Core 4 to Fedora Core 5
Date: Mon, 10 Jul 2006 11:06:48 -0400

[ Switching Cc: from gtk-list to gtk-devel-list ]

On Mon, 2006-07-10 at 16:27 +0200, Clemens Eisserer wrote:
> Hi Owen,
> 
> > It's very likely that the speedup here is simply due to variations as
> > to what gets put into video memory and what into system memory.
> 
> I talked to nVidia driver developers about this issue and they told me
> that this is the case (I also saw a lot of time spent in malloc
> routines). This is what initially got me working on this a bit.
> They said that they'll include an option to disable pixmap-punting,
> however then a lot of time will be spent inside the driver and the GPU
> for allocating the pixmap in VRAM.
> It's quite common that such surface-punting optimizations are in the
> drivers - I would not be surprised if ATI did the same in their
> proprietary drivers.
> That's why I would be really interested in results on ATI chips
> (open and proprietary drivers) as well as onboard GPUs.

I think you misunderstood me; I'm not saying that time spent allocating
pixmaps in video RAM is a problem, I'm saying that the performance of
Cairo drawing to video RAM will be very different from drawing to system
RAM, and allocating many megabytes of persistent double buffers can
easily change what ends up in system RAM and what in video RAM.

Now, maybe the nVidia drivers/hardware have problems with allocating
pixmaps; my expectation and experience is that allocating a pixmap is
much less expensive than the rest of an expose operation; but if the
nVidia drivers do something heavyweight like flushing the GPU pipeline,
who knows.

(As a general philosophy: it's hard to optimize GTK+ for a black box
like the proprietary nVidia drivers; I think GTK+ should do things that
work well with the open source drivers, and let nVidia optimize for
GTK+ if their drivers have problems with that.)

> But keeping in mind that an expose should take no more than
> 50-75ms, I still think allocating and deallocating 12MB of pixmaps
> (gftp) for a single expose is simply too expensive.

I very much doubt that the time for allocating pixmaps is proportional
to the number of pixels in them.

> Also, many double-buffered toolkits do it the same way - Java Swing
> does not even release the pixmaps if windows are minimized.

Not a convincing argument as to why GTK+ should do something.

You have to keep in mind that many systems still don't have a lot
of video memory, so if all windows in the system start keeping large
back buffers, the effect may well be exhaustion of all available
video memory.
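
To put rough numbers on this, here is a small sketch of how much memory
persistent back buffers would take. The 32 bits per pixel, the 1280x1024
window size, and the window count are assumptions chosen for
illustration, not measurements, and the function names are made up for
this example. Real drivers may also pad or align pixmaps.

```python
BYTES_PER_PIXEL = 4  # assumption: 32 bits per pixel


def back_buffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes needed for one full-window back buffer (ignores driver padding)."""
    return width * height * bytes_per_pixel


def total_back_buffer_bytes(windows, bytes_per_pixel=BYTES_PER_PIXEL):
    """Sum the back-buffer sizes for a list of (width, height) windows."""
    return sum(back_buffer_bytes(w, h, bytes_per_pixel) for w, h in windows)


if __name__ == "__main__":
    # Hypothetical desktop: ten maximized 1280x1024 toplevel windows.
    windows = [(1280, 1024)] * 10
    total = total_back_buffer_bytes(windows)
    print(f"{total / (1024 * 1024):.1f} MiB for {len(windows)} windows")
```

Under these assumptions each window needs 5 MiB, so ten of them come to
50 MiB before anything else is placed in video memory.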

> I would like to see a design which maybe would allow different
> approaches (maybe based on user decisions), so everybody would have
> the choice.

While it's cool you are trying out different ways to do things, I think
putting any code in GTK+ in advance of more careful performance
investigation would be a mistake. Try some micro benchmarks: how long
does allocating pixmaps take? Does that depend on size? Try your
gtk-bench tests with no other windows open, then try them again after
allocating 100MB of 100x100 pixmaps to run yourself out of video memory,
etc.
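
One possible way to structure such micro benchmarks is sketched below.
The allocate and release hooks are hypothetical: the stand-ins here only
allocate system memory, so their timings say nothing about X. A real
test would create the pixmap through the X server and wait for the
server to process the request before stopping the clock (in Xlib terms,
XCreatePixmap followed by XSync). The 4 bytes per pixel used to work out
how many 100x100 pixmaps make up 100MB is also an assumption.

```python
import time


def pixmaps_to_fill(total_bytes, width, height, bytes_per_pixel=4):
    """How many width x height pixmaps fit in total_bytes (no padding)."""
    return total_bytes // (width * height * bytes_per_pixel)


def time_allocations(allocate, release, sizes, repeats=50):
    """Return {(width, height): mean seconds per allocate+release cycle}."""
    results = {}
    for width, height in sizes:
        start = time.perf_counter()
        for _ in range(repeats):
            handle = allocate(width, height)
            release(handle)
        results[(width, height)] = (time.perf_counter() - start) / repeats
    return results


def standin_allocate(width, height):
    # Stand-in only: a system-memory buffer, NOT an X pixmap.
    return bytearray(width * height * 4)


def standin_release(handle):
    # Stand-in only: nothing to free explicitly for a bytearray.
    del handle


if __name__ == "__main__":
    sizes = [(100, 100), (400, 400), (1600, 1200)]
    for size, seconds in time_allocations(
        standin_allocate, standin_release, sizes
    ).items():
        print(size, f"{seconds * 1e6:.1f} us")
    print(pixmaps_to_fill(100 * 1024 * 1024, 100, 100), "pixmaps for 100MB")
```

Running the same size sweep once on an idle server and once after
allocating the filler pixmaps would show whether allocation time depends
on size and on how full video memory is.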

(Make sure to close your browser before doing tests like this; Firefox
is not shy about consuming video memory.)

						Owen

P.S. 

- using a single pixmap to do exposes for an entire set of subwindows of
a toplevel window does have some advantages, like better
redisplay, so it could be interesting to investigate, even if keeping
persistent double buffers around is a bad idea.

- in the composited architecture, there already *are* persistent double
buffers for all toplevel windows, so investigating using those in some
way could be interesting. What you have to avoid is the compositing
manager doing tons of extra work in this case.

