Re: [cairo] Better coverage from cairo performance suite (and some results)



On Thu, 05 Oct 2006 11:32:37 -0700, Carl Worth wrote:
> 
> Another tool that will be helpful to have is something for doing
> historical comparison over several runs. The simplest tool, and very
> useful, would be a "performance diff" that takes two runs and reports
> the difference, (perhaps only showing tests where the results differ
> more than a single standard deviation). I would use that kind of tool
> constantly to ensure that submitted patches do provide the desired
> performance improvements.

I've written that program now. It's called cairo-perf-diff and it's
built in the cairo/perf directory. The Makefile won't install it or
anything, as I figure it's easy enough for interested people to just
manually copy it to ~/bin or whatever.

The interesting part of making this program work well is in what it
_doesn't_ show. Currently it discards as uninteresting any change for
which the mean values are not separated by more than 3 standard
deviations of each.

Ideally, that's the only kind of discarding we would do, but it's not
quite working well enough yet. So, currently I'm also discarding any
changes below a given threshold, (5% by default, but can also be
specified as the third argument on the command line).
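To make the two filters concrete, here is a minimal sketch in Python (not the actual C code in cairo/perf). The function name, the parameter names, and the exact definition of "percentage change" are assumptions for illustration. It also reads "3 standard deviations of each" as requiring the gap between the means to exceed 3 times each run's own standard deviation, which is only one possible reading.

```python
def is_interesting(mean_a, std_a, mean_b, std_b, min_change=0.05, sigmas=3.0):
    """Return True if the change between two runs should be reported.

    Hypothetical helper: means and standard deviations are absolute
    values in the same unit (e.g. ticks), and both means are positive.
    """
    gap = abs(mean_b - mean_a)

    # Statistical filter: the means must be separated by more than
    # `sigmas` standard deviations of each run (assumed reading).
    if gap <= sigmas * std_a or gap <= sigmas * std_b:
        return False

    # Threshold filter: the relative change must reach min_change.
    # Assumption: change is measured as slower/faster - 1.
    faster, slower = min(mean_a, mean_b), max(mean_a, mean_b)
    change = slower / faster - 1.0
    return change >= min_change
```

Passing min_change=0.0 leaves only the statistical filter active, which corresponds to giving 0.0 as the third argument on the command line.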

Even then, it's still not discarding all the noise. There's a really
easy test for this. Just run cairo-perf twice (saving the output from
each run into first.perf and second.perf) and then run:

	cairo-perf-diff first.perf second.perf 0.0

(That third argument, 0.0, forces it to discard only on the basis of
overlapping probability distributions, using the 3 standard
deviations, and not to discard things based on the percentage change
being too small.)

If everything were working correctly, the output from the above would
be empty, since there should be no interesting changes in the
performance results, (and any variation should be captured by the
reported standard deviations). But the results aren't empty yet.

I did some things to attempt to improve this already. For example, I've
made cairo-perf output the number of ticks it measures in addition to
the time in milliseconds it estimates, (based on an estimate of the CPU
frequency that it measures). So cairo-perf-diff computes only on the
ticks columns, (but puts the time in its output for readability).

I think other problems are the fixed-percentage outlier elimination
and early bailout based on a stably low standard deviation. I think
these prevent the standard deviation from capturing the true amount of
variation. I started some work to eliminate the early bailout and to
do adaptive outlier detection, (based on the conventional "1.5 times
the interquartile range above the third quartile or below the first
quartile" http://mathworld.wolfram.com/Outlier.html ).
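For anyone who wants to experiment, the conventional rule quoted above can be sketched in a few lines of Python. This is only an illustration of the 1.5 * IQR rule, not the code in cairo-perf. The function name is hypothetical, and the choice of the "inclusive" quartile method is an assumption (other quartile definitions give slightly different fences).

```python
import statistics


def drop_outliers(samples):
    """Keep only samples within 1.5 * IQR of the first and third quartiles."""
    if len(samples) < 4:
        # Too few samples for meaningful quartiles; keep everything.
        return list(samples)

    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Anything below the low fence or above the high fence is an outlier.
    return [s for s in samples if low <= s <= high]
```

Because the fences are derived from the samples themselves, a noisy test keeps more of its spread than it would under a fixed-percentage cut, which is the point of making the detection adaptive.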

I haven't succeeded at making great improvements along those lines,
(particularly in light of the fact that removing the early bailout
slows things down a lot). And I really need to start using this tool
to land cairo patches rather than developing it further. So if anyone else wants
to improve things to try to get the command above to report nothing,
then that would be greatly appreciated.

In the meantime, here's a sample showing what the output can look
like. Here's what cairo-perf-diff gives me when I give it the results
of cairo-perf before and after the patch that Monty provided for
fixing the subimage_copy performance bug in cairo:

-Carl

Speedups
========
 xlib-rgba              subimage_copy-512    3.93 2.46% ->   0.07 2.71%: 52.91x faster
███████████████████████████████████████████████████▉
 xlib-rgb               subimage_copy-512    4.03 1.97% ->   0.09 2.61%: 44.74x faster
███████████████████████████████████████████▊
 xlib-rgba              subimage_copy-256    1.02 2.25% ->   0.07 0.56%: 14.42x faster
█████████████▍
 xlib-rgba        text_image_rgb_over-256   63.21 1.53% ->  11.87 2.17%:  5.33x faster
████▍
 xlib-rgba       text_image_rgba_over-256   62.31 0.72% ->  11.87 2.82%:  5.25x faster
████▎
 xlib-rgba     text_image_rgba_source-256   67.97 0.85% ->  16.48 2.23%:  4.13x faster
███▏
 xlib-rgba      text_image_rgb_source-256   68.82 0.55% ->  16.93 2.10%:  4.07x faster
███▏
 xlib-rgba              subimage_copy-128    0.19 1.72% ->   0.06 0.85%:  3.10x faster
██▏
 xlib-rgb         text_image_rgb_over-256  108.22 0.40% ->  57.47 0.37%:  1.88x faster
▉
 xlib-rgb        text_image_rgba_over-256  107.32 0.59% ->  57.32 0.78%:  1.87x faster
▉
 xlib-rgb       text_image_rgb_source-256  114.92 0.44% ->  61.73 0.79%:  1.86x faster
▉
 xlib-rgb      text_image_rgba_source-256  114.01 0.51% ->  61.69 0.51%:  1.85x faster
▉
 xlib-rgba              subimage_copy-64     0.11 2.24% ->   0.06 0.73%:  1.83x faster
▉
 xlib-rgb               subimage_copy-256    2.81 1.57% ->   1.65 1.19%:  1.71x faster
▊
 xlib-rgba        text_image_rgb_over-128    4.78 2.22% ->   2.85 1.06%:  1.68x faster
▋
 xlib-rgba       text_image_rgba_over-128    4.72 1.38% ->   2.83 0.92%:  1.67x faster
▋
 xlib-rgba      text_image_rgb_source-128    5.82 0.22% ->   3.92 0.57%:  1.48x faster
▌
 xlib-rgba     text_image_rgba_source-128    5.79 0.25% ->   3.93 1.56%:  1.47x faster
▌
 xlib-rgba       text_image_rgba_over-64     1.53 1.03% ->   1.13 0.42%:  1.35x faster
▍
 xlib-rgba        text_image_rgb_over-64     1.52 0.45% ->   1.13 1.15%:  1.34x faster
▍
 xlib-rgb               subimage_copy-64     0.25 1.04% ->   0.19 2.61%:  1.34x faster
▍
 xlib-rgb               subimage_copy-128    0.64 1.65% ->   0.50 1.09%:  1.27x faster
▎
 xlib-rgba      fill_radial_rgba_over-256    9.75 0.95% ->   7.81 2.55%:  1.25x faster
▎
 xlib-rgba        fill_image_rgb_over-256    2.56 0.77% ->   2.07 1.49%:  1.24x faster
▎
 xlib-rgba       fill_image_rgba_over-256    2.55 0.41% ->   2.06 1.01%:  1.23x faster
▎
 xlib-rgba      text_image_rgb_source-64     2.27 0.91% ->   1.88 0.20%:  1.21x faster
▎
 xlib-rgba       fill_radial_rgb_over-256    9.68 0.60% ->   8.17 0.51%:  1.18x faster
▏
 xlib-rgba     fill_image_rgba_source-256    3.95 2.11% ->   3.35 1.51%:  1.18x faster
▏
 xlib-rgba              subimage_copy-32     0.07 1.57% ->   0.06 0.91%:  1.17x faster
▏
 xlib-rgba     text_image_rgba_source-64     2.25 0.28% ->   1.92 1.57%:  1.17x faster
▏
 xlib-rgba      fill_image_rgb_source-256    3.85 0.39% ->   3.32 1.20%:  1.16x faster
▏
 xlib-rgb         text_image_rgb_over-64     4.60 2.34% ->   4.06 0.51%:  1.13x faster
▏
 xlib-rgb         text_image_rgb_over-128   16.05 1.57% ->  14.24 1.86%:  1.13x faster
▏
 xlib-rgb       text_image_rgb_source-128   17.20 2.02% ->  15.32 1.76%:  1.12x faster
▏
 xlib-rgb        text_image_rgba_over-64     4.54 0.71% ->   4.11 1.08%:  1.10x faster
▏
 xlib-rgb       text_image_rgb_source-64     5.03 0.35% ->   4.59 0.16%:  1.10x faster
▏
 xlib-rgba       fill_image_rgba_over-64     0.36 1.78% ->   0.33 0.61%:  1.09x faster
▏
 xlib-rgb      text_image_rgba_source-64     4.99 0.20% ->   4.61 0.49%:  1.08x faster
▏
 xlib-rgb               subimage_copy-32     0.11 1.24% ->   0.10 1.13%:  1.07x faster
▏
 xlib-rgba     fill_radial_rgb_source-128    2.54 0.44% ->   2.38 0.31%:  1.07x faster
▏
 xlib-rgba     fill_image_rgba_source-64     0.48 0.65% ->   0.45 0.58%:  1.07x faster
▏
 xlib-rgba       fill_radial_rgb_over-128    2.19 0.33% ->   2.06 1.00%:  1.06x faster
▏
 xlib-rgba      fill_image_rgb_source-64     0.48 0.60% ->   0.45 0.72%:  1.06x faster

Slowdowns
=========
 xlib-rgba  paint_similar_rgba_source-256    0.12 2.52% ->   0.16 2.81%:  1.33x slower
▍
image-rgba    paint_image_rgba_source-256    0.08 0.39% ->   0.10 2.45%:  1.25x slower
▎
image-rgba  paint_similar_rgba_source-256    0.09 0.38% ->   0.10 2.35%:  1.20x slower
▎
image-rgb        paint_solid_rgb_over-512    0.64 1.12% ->   0.74 1.57%:  1.17x slower
▏
image-rgb     paint_solid_rgba_source-512    0.64 1.21% ->   0.74 0.44%:  1.17x slower
▏
image-rgb      paint_solid_rgb_source-512    0.64 0.93% ->   0.74 0.59%:  1.16x slower
▏
image-rgb     paint_radial_rgb_source-512   53.05 2.18% ->  60.76 2.07%:  1.15x slower
▏
 xlib-rgba       text_radial_rgb_over-64     3.95 0.57% ->   4.48 1.09%:  1.14x slower
▏
image-rgba    paint_solid_rgba_source-512    0.66 1.65% ->   0.73 1.10%:  1.12x slower
▏
image-rgba     paint_solid_rgb_source-512    0.66 1.90% ->   0.73 0.74%:  1.11x slower
▏
image-rgb   paint_similar_rgba_source-256    0.26 1.09% ->   0.29 0.98%:  1.11x slower
▏
image-rgb       fill_radial_rgba_over-256    5.57 0.30% ->   6.11 0.24%:  1.10x slower
▏
image-rgb      paint_radial_rgba_over-512   55.79 1.42% ->  60.80 0.68%:  1.09x slower
▏
image-rgb     fill_radial_rgba_source-128    1.64 0.20% ->   1.78 0.15%:  1.09x slower
▏
image-rgb     fill_radial_rgba_source-256    6.02 0.49% ->   6.55 0.26%:  1.09x slower
▏
image-rgb       fill_radial_rgba_over-128    1.54 1.08% ->   1.66 0.15%:  1.07x slower
▏
image-rgb     fill_radial_rgba_source-64     0.56 0.47% ->   0.60 0.46%:  1.07x slower
▏
image-rgb      paint_image_rgb_source-256    0.08 0.39% ->   0.09 0.78%:  1.06x slower

image-rgb       fill_radial_rgba_over-64     0.53 0.14% ->   0.56 0.47%:  1.06x slower

 xlib-rgba     fill_radial_rgb_source-64     0.83 0.38% ->   0.88 0.46%:  1.05x slower
