weird GdkPixmap performance in win32



I have been playing around a lot with X, gtk and GDI to explore what kind of performance can be achieved with the basic ubiquitous APIs on the respective platforms, and have made a few interesting discoveries which I thought I'd share and ask about.

First, some background. The use case I am fixated on is optimising the performance of a window that scrolls in both dimensions, so that scrolling is as smooth as possible. Without babbling about it too much, the best solution I have come up with is to keep an array of pixmap tiles covering the visible part of the viewport, and to create and paint the tiles as they come into view (a rough sketch of the idea follows the build notes below). With this problem in mind I created a few programs to benchmark various approaches, using xlib on Linux, GDI on Windows and gtk-- on both. The code is here:

http://knobbits.org/archived/2009-11/pixmap-benchmark.tar.gz

("make -k" to compile the binaries that will compile on each platform. Also, please excuse the poor excuse for OOP. It seemed like a good idea before I started coding. Lastly, yes it's in gtk-- but I presume this all applies to gtk+ too. They only work in 32-bit color modes, sorry!)

The thing I mainly want to talk about is the performance of blitting server-side images (Bitmaps in GDI, Pixmaps in xlib, GdkPixmaps in GDK).
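
For concreteness, the operation being timed is essentially this (illustrative fragments only, not the benchmark code; creation of the handles is omitted):

    // Illustrative only: the per-blit call each benchmark times, once the
    // server-side image (HBITMAP / X Pixmap / GdkPixmap) has been created.
    #ifdef _WIN32
    #include <windows.h>
    // GDI: hdcMem is a memory DC with the HBITMAP already selected into it.
    void blit(HDC hdcWin, HDC hdcMem, int x, int y, int w, int h)
    {
        BitBlt(hdcWin, x, y, w, h, hdcMem, 0, 0, SRCCOPY);
    }
    #else
    #include <X11/Xlib.h>
    // xlib: copy from a server-side Pixmap to the window.
    void blit(Display* dpy, Window win, Pixmap pix, GC gc, int x, int y, int w, int h)
    {
        XCopyArea(dpy, pix, win, gc, 0, 0, w, h, x, y);
    }
    #endif

    #include <gdk/gdk.h>
    // GDK (either platform): the same operation through the portable API.
    void blit_gdk(GdkWindow* win, GdkGC* gc, GdkPixmap* pix, int x, int y, int w, int h)
    {
        gdk_draw_drawable(GDK_DRAWABLE(win), gc, GDK_DRAWABLE(pix), 0, 0, x, y, w, h);
    }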

- (in case you didn't already know) With the right hardware and drivers they are *fast*. With large enough pixmaps my laptop with an nvidia Quadro NVS 135M reaches about 600 Mpix/sec, and my desktop with an nvidia 9800 GT gets to about 4 Gpix/sec. I wonder whether, even though it *looks* like it's doing what it's meant to, the card/driver is just *pretending* to do the blit half the time to fool me (it is running quite a bit faster than the display refresh rate).

- On Linux/X at least, the xlib and gtk versions perform pretty much identically. In particular, the per-blit overhead (measurable by benchmarking blits of tiny images, say 8x8) is very similar between my bare-bones(-ish) xlib program and gtk--. Sweet.

- In Windows it gets more interesting. For the GDI version I tried both CreateBitmap and CreateCompatibleBitmap. Images created with CreateCompatibleBitmap would reach 4 Gpix/sec, but images created with CreateBitmap would only reach about 400 Mpix/sec, even though their pixel data format is identical. win32 gtk's GdkPixmap performance was also in the 400 Mpix/sec range. I have no idea what's going on in the nvidia drivers. (Roughly what I tried is sketched after this list.)

- Also, on my desktop the win32 gtk version does not perform consistently: jerks are visible while the program runs. I was able to apparently replicate this behavior in the Windows GDI program by creating and destroying a separate DC for each image blit (try "testwingdi -t3"; see the sketch after this list). This suggests that creation (or destruction?) of DCs has erratic performance.
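
To show what I mean by the two GDI paths and the per-blit DC experiment, here is roughly what the benchmark does. This is a simplified sketch, not the actual testwingdi code, and the function names are mine:

    #include <windows.h>

    // Simplified sketch of the two bitmap-creation paths being compared.
    // On my desktop, only the CreateCompatibleBitmap path reaches ~4 Gpix/sec.
    HBITMAP make_compatible(HDC hdcWin, int w, int h)
    {
        return CreateCompatibleBitmap(hdcWin, w, h);   // driver-chosen layout
    }

    HBITMAP make_plain(int w, int h)
    {
        // Same 32-bit pixel format in the end, but roughly 10x slower to blit here.
        return CreateBitmap(w, h, 1 /*planes*/, 32 /*bits per pixel*/, NULL);
    }

    // Normal case: one long-lived memory DC reused for every blit.
    void blit(HDC hdcWin, HDC hdcMem, HBITMAP bmp, int x, int y, int w, int h)
    {
        HGDIOBJ old = SelectObject(hdcMem, bmp);
        BitBlt(hdcWin, x, y, w, h, hdcMem, 0, 0, SRCCOPY);
        SelectObject(hdcMem, old);
    }

    // The "-t3" variant: create and destroy a memory DC around every blit.
    // This is what reproduced the jerky behaviour for me.
    void blit_with_fresh_dc(HDC hdcWin, HBITMAP bmp, int x, int y, int w, int h)
    {
        HDC hdcMem = CreateCompatibleDC(hdcWin);
        HGDIOBJ old = SelectObject(hdcMem, bmp);
        BitBlt(hdcWin, x, y, w, h, hdcMem, 0, 0, SRCCOPY);
        SelectObject(hdcMem, old);
        DeleteDC(hdcMem);
    }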

Does anyone know the reasons for what I'm seeing? Does it happen on other hardware? I would be interested to hear results from other machines.

If my gtk-- prog is met with enough hostility I'll rewrite it in plain C/gtk+.

My wish is to achieve that 4 Gpix/sec performance in gtk on both Windows and Linux, without anything too platform-specific in the application code.

FYI, the Linux tests were run on Ubuntu Jaunty (gtk+ 2.16.1) and the Windows tests on XP SP3 with 2.16.something. (what is gtkmm 2.16.0-4?)

Mick.


