Re: [cairo] Better coverage from cairo performance suite (and some results)



Hello Carl,

> Bad stuff:
> 
>  - Gradient computation is *really* slow, (which we did know about
>    already at least). But now we know how slow. In the time it takes
>    to compute a gradient image, we could have painted that image 15
>    times (for linear gradients) or 30 times (for radial gradients).
> 

For the record, the performance patchset I already sent you speeds up gradient
rendering by nearly a factor of 2. If you're still interested, I've uploaded
an updated version which corresponds to the current git repository at the
following address:

  http://david.freetype.org/cairo/cairo-performance-1.patchset

On my machine I get the following numbers with the current git code:

  [ 13]    image-rgba            paint_linear_rgb_source-64      0.443  0.05%   100
  [ 14]    image-rgba             paint_linear_rgba_over-64      0.486  0.09%   100
  [ 15]    image-rgba           paint_linear_rgba_source-64      0.443  0.04%   100
  [ 16]    image-rgba              paint_radial_rgb_over-64      0.901  0.31%   100
  [ 17]    image-rgba            paint_radial_rgb_source-64      0.905  0.36%   100
  [ 18]    image-rgba             paint_radial_rgba_over-64      0.916  0.25%   100
  [ 19]    image-rgba           paint_radial_rgba_source-64      0.881  0.26%   100

  [ 72]    image-rgba              paint_linear_rgb_over-512    28.773  3.20%   100
  [ 73]    image-rgba            paint_linear_rgb_source-512    28.487  2.97%   100
  [ 74]    image-rgba             paint_linear_rgba_over-512    31.145  2.96%   100
  [ 75]    image-rgba           paint_linear_rgba_source-512    27.860  0.60%   100
  [ 76]    image-rgba              paint_radial_rgb_over-512    58.246  0.44%   100
  [ 77]    image-rgba            paint_radial_rgb_source-512    58.137  0.46%   100
  [ 78]    image-rgba             paint_radial_rgba_over-512    59.015  0.62%   100
  [ 79]    image-rgba           paint_radial_rgba_source-512    61.295  1.33%   100

and once patched:

  [ 12]    image-rgba              paint_linear_rgb_over-64      0.232  0.02%   100
  [ 13]    image-rgba            paint_linear_rgb_source-64      0.233  0.01%   100
  [ 14]    image-rgba             paint_linear_rgba_over-64      0.275  0.02%   100
  [ 15]    image-rgba           paint_linear_rgba_source-64      0.234  0.01%   100
  [ 16]    image-rgba              paint_radial_rgb_over-64      0.723  0.53%   100
  [ 17]    image-rgba            paint_radial_rgb_source-64      0.713  0.40%   100
  [ 18]    image-rgba             paint_radial_rgba_over-64      0.724  0.28%   100
  [ 19]    image-rgba           paint_radial_rgba_source-64      0.688  0.22%   100

  [ 72]    image-rgba              paint_linear_rgb_over-512    13.983  0.50%   100
  [ 73]    image-rgba            paint_linear_rgb_source-512    13.881  0.59%   100
  [ 74]    image-rgba             paint_linear_rgba_over-512    16.591  0.54%   100
  [ 75]    image-rgba           paint_linear_rgba_source-512    13.873  0.65%   100
  [ 76]    image-rgba              paint_radial_rgb_over-512    45.823  0.40%   100
  [ 77]    image-rgba            paint_radial_rgb_source-512    45.724  0.47%   100
  [ 78]    image-rgba             paint_radial_rgba_over-512    45.958  0.40%   100
  [ 79]    image-rgba           paint_radial_rgba_source-512    47.425  0.47%   100

As you'll notice, linear gradients benefit the most from the speedup, since radial
ones are still handicapped by the very slow math computations they need to do.
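
To put numbers on that, the ratio of the "before" and "after" timings can be computed
directly from the tables above. The helper below is only an illustration (the name
speedup is hypothetical, not part of cairo-perf); for the 512 cases it gives roughly
2.06 for paint_linear_rgb_over and roughly 1.27 for paint_radial_rgb_over.

```c
/* Hypothetical helper: ratio of the timing before the patch to the timing after it. */
static double speedup(double before, double after)
{
    return before / after;
}

/* Values copied from the tables above:
 *   speedup(28.773, 13.983)   paint_linear_rgb_over-512
 *   speedup(58.246, 45.823)   paint_radial_rgb_over-512
 */
```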

I also expect the patched code to be much faster than the original on embedded
processors like the ARM, mainly because I removed one division per pixel through a
simple multiply-by-inverse trick (the ARM has no hardware division instruction, so
every division is done in software).
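
As a rough illustration of the multiply-by-inverse idea (a minimal sketch, not the code
from the patchset): the reciprocal of the divisor is computed once per span, and each
pixel then costs one multiplication instead of one division. The function names and the
16.16 fixed-point layout are assumptions made for this example, and inputs are assumed
to be non-negative.

```c
#include <stdint.h>

/* Hypothetical helper: 16.16 reciprocal of a positive 16.16 value.
 * The caller must keep d large enough for the result to fit in 32 bits. */
static int32_t fixed_inverse(int32_t d)
{
    return (int32_t)(((int64_t)1 << 32) / d);
}

/* Hypothetical helper: 16.16 multiplication through a 64-bit intermediate. */
static int32_t fixed_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}

/* Before: one division per pixel. */
static void fill_with_division(int32_t *out, int n,
                               int32_t start, int32_t step, int32_t denom)
{
    int32_t num = start;
    for (int i = 0; i < n; i++) {
        out[i] = (int32_t)(((int64_t)num * 65536) / denom);
        num += step;
    }
}

/* After: one division per span, one multiplication per pixel. */
static void fill_with_inverse(int32_t *out, int n,
                              int32_t start, int32_t step, int32_t denom)
{
    int32_t inv = fixed_inverse(denom);   /* the only division */
    int32_t num = start;
    for (int i = 0; i < n; i++) {
        out[i] = fixed_mul(num, inv);
        num += step;
    }
}
```

The two loops agree exactly when the reciprocal is exact (for instance a power-of-two
divisor); in general the reciprocal is truncated, so the low bits can differ slightly.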

It'd be interesting to modify the pixman code even further in the following
directions:

- do not generate a big square/rectangular gradient image before compositing
  with the real source mask.

- avoid 64-bit computations when the input parameters allow it

I'll try to see what I can do there, but this is certainly less trivial.
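
For the second point, one possible shape (purely a sketch, nothing like this exists in
the patchset) is to check the input ranges once per gradient and only fall back to a
64-bit intermediate when a product could overflow 32 bits. The names below are
hypothetical, and operands are assumed to be non-negative 16.16 values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check, done once per gradient: can the product of two
 * operands bounded by these magnitudes be held in a signed 32-bit value? */
static bool product_fits_int32(int32_t max_abs_a, int32_t max_abs_b)
{
    return (int64_t)max_abs_a * max_abs_b <= INT32_MAX;
}

/* Hypothetical 16.16 multiply: takes the 32-bit path when the caller has
 * already established that the product cannot overflow. */
static int32_t fixed_mul_checked(int32_t a, int32_t b, bool narrow)
{
    if (narrow)
        return (a * b) >> 16;                    /* 32-bit arithmetic only */
    return (int32_t)(((int64_t)a * b) >> 16);    /* 64-bit intermediate */
}
```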

Note that the patch also slightly improves tessellation performance (thanks to inline
assembly in a single function, which means it doesn't fix the edge-case computation
errors in the original code):

  [260]     xlib-rgb                       tessellate-16-100     0.159  0.25%   100
  [261]     xlib-rgb                       tessellate-64-100    12.664  0.63%   100
  [262]     xlib-rgb                      tessellate-256-100   948.374  0.09%   100

becomes:

  [260]     xlib-rgb                       tessellate-16-100     0.131  0.24%   100
  [261]     xlib-rgb                       tessellate-64-100     9.234  0.56%   100
  [262]     xlib-rgb                      tessellate-256-100   707.965  0.16%   100

though this is probably much more modest than your gains with the new tessellator.

Hope you'll find this useful, and that you'll find the time to integrate this into
the Cairo git repository.

Regards,




- David Turner
- The FreeType Project  (www.freetype.org)



