Re: gdk_pixbuf interpolation tables



Owen:

> Not having touched this code in a while, I had to spend some
> time going back through the code and the thought processes
> behind it. While doing that I wrote up some of the "math"
> behind it:
> 
>  http://people.redhat.com/otaylor/gtk/pixbuf-transform-math.ps
> 
> It's nothing very deep... just a little bit of symbol juggling
> really. But it makes it a little more obvious why 
> the filters we use are actually separable.

Still, it is a very useful document.  Since it is only 64K, you might
consider including it in the gtk+ module as background info.

> I've also documented in there what the filters are that are being
> used.
> 
> The observation that all of the filters we use now are separable
> is an important one. (I had some inkling of this before as shown
> by how I defined the "BILINEAR" filter, but never really made the
> generalization). It certainly does reduce the complexity and
> expense of  calculating the tables a lot, and anything that makes
> pixops.c shorter and simpler (not to mention faster) is a good thing.
>
> Still, I wonder if pursuing this is really right in the long
> term:
> 
>  - Once calculating the filter tables gets to be a big
>    bottleneck, we're already in a regime where our
>    algorithms aren't a good idea.

I should point out that the reason that I made this change had nothing
to do with performance.  I wanted to more tightly integrate the GTK+
scaling code with Sun's mediaLib library (which takes advantage of the
VIS multimedia instruction set).  The most generic mediaLib scaling
functions have the following limitations:

1. The size of the table is fixed to 2x2, 3x3 or 4x4.

2. mediaLib doesn't really have default TILE or hyperbolic filters
   that can be used.
   
However mediaLib does support more advanced scaling functions where
the user can specify their own table of any size.  Therefore using
these functions we can get scaling that is much more similar to the
way GTK+ itself works.

Unfortunately things weren't quite so simple.  These more advanced
mediaLib functions require that the interpolation table be specified
as two vectors rather than as a full 2-D table.  Since the GTK+ tables
were separable anyway, I went ahead and did this work to make
mediaLib integrate with GTK+ more closely.

This also explains why I was so careful to separate the logic
specific to converting the table to a table of integers into the
convert_vectors_to_table function, rather than folding those
conversions into the vectors directly, and why I split overall_alpha
evenly over the two vectors.
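
To make that concrete, here is a minimal sketch (hypothetical struct
and names, not the actual patch code): each 1-d vector sums to
sqrt(overall_alpha), and the 2-D integer table is just the outer
product of the two vectors, built by a convert_vectors_to_table-style
helper:

  /* A separable filter phase: the 2-D table is the outer product of
   * an x vector and a y vector.  Each vector sums to
   * sqrt (overall_alpha), so the expanded table sums to overall_alpha. */
  typedef struct {
    int     n_x, n_y;      /* filter support in each direction */
    double *x_weights;     /* n_x entries, sum = sqrt (overall_alpha) */
    double *y_weights;     /* n_y entries, sum = sqrt (overall_alpha) */
  } SeparableFilter;

  /* Expand the vectors into the 16.16 fixed point table that the
   * generic GTK+ code expects. */
  static void
  convert_vectors_to_table (const SeparableFilter *f,
                            int                   *table)
  {
    int i, j;

    for (i = 0; i < f->n_y; i++)
      for (j = 0; j < f->n_x; j++)
        table[i * f->n_x + j] =
          (int) (f->y_weights[i] * f->x_weights[j] * 65536 + 0.5);
  }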

>    A well known defect of the current pixops code is that
>    it performs horribly for large scale down factors. 
>    Say, you are scaling a 1024x768 image to 64x48 for a
>    thumbnail. Then n_x = n_y = 9, and we compute
> 
>     9*9*16*16 == 20736
> 
>    filter coefficients. If the user hits the zoom-out button
>    a few more times, and the program tries to scale the
>    image down to 1x1, then we have:
> 
>      1025x1025x16x16 = 269M
> 
>    filter coefficients. Oops!
> 
>    Of course, if we can compute the coefficients faster, 
>    we're better off, but not that much better off.

Right.  I think it would make sense to cap the table at some
reasonable maximum size.  Perhaps using 1-d vectors allows that
maximum to be a bit larger, since the math is slightly less heavy.
 
>  - The traditional 3D graphics way of doing downscaling
>    is to create a mipmap of scaled down images and then
>    do bilinear or trilinear interpolation from there.
>    Scaling down by a factor of two is something that 
>    we can code to be really fast, so this isn't unreasonable
>    even if we have to generate the mipmap on the fly.
> 
>    The main quality defect of this approach is that when you
>    are scaling by very different amounts in the two
>    directions, you lose detail. But anisotropic scaling is
>    going to be less common for our uses than it is in 3D
>    graphics since we aren't drawing lots of patterned floors 
>    receding off into the distance.
> 
>    You can also (like current 3D cards) use a larger
>    filter on top of the MIP-map to deal with the anisotropic
>    case.

Perhaps these could be added to GTK+ as new filters in addition
to TILES, BILINEAR, HYPER, and NEAREST, rather than as a replacement?
 
>  - We get fairly frequent requests for rotation support
>    in gdk-pixbuf. When you switch from a simple scale, to
>    an arbitrary affine transform, the filter tables are 
>    no longer separable, and also aren't easily calculable
>    analytically. So, a table based approach is a lot less
>    attractive.
> 
> So, I'm wondering if a complete rewrite of pixops.c to 
> a mipmap'ed approach may be the right long term approach.

Perhaps.  Though I think the current functions do have value for
the average scaling cases.
 
> A couple of detailed comments on your mail.
> 
> > 2. Separated tables can save memory.  It is only necessary to build
> >    the full 2d tables as they are accessed.  For example, most of
> >    the pixops_process functions deal with 1 line of tables.  Therefore
> >    the logic could easily be modified to only build each line of 
> >    tables as needed.  This has no extra CPU cost, but reduces memory
> >    used.  I did not yet implement this change, however I think it is
> >   clear that this would be fairly straightforward to do.
> 
> Building the tables on an as-needed basis doesn't save you much
> on "average", since for almost all scale factors, you'll need
> all the horizontal and vertical phases. Though the cases
> where the scale-down factor is a small integer fraction are
> certainly more common than others. (That is, to scale from
> 200 to 160, the scale factor is 5:4, so we only need 4 phases
> 0, 0.25, 0.50, 0.75 rather than all 16 phases.)

The real advantage of building the tables as needed is that you never
have to build the full 2-d table to begin with, saving both memory
and time.  However, I noticed that the table is accessed multiple
times in pixops_process (it is passed to process_pixel, then to
*line_func, and then to process_pixel again), so the code may need to
be reorganized in order to really benefit from this sort of
technique.  Obviously the benefit of computing coefficients on the
fly is lost if the same calculation has to be done multiple times.

Another advantage is that if you are only scaling in one dimension,
then you only need to build one vector!  The code to support this
isn't really there yet, but I think the concept is clear.
 
> I think it probably would work best to compute the needed phases
> in advance and pregenerate them rather than doing it on-demand,
> since it avoids complexity in the inner horizontal loop.

Yes, another option is simply to build each line of the table as
needed, especially if reorganizing the code isn't possible.  Each
line would be built the first time it is accessed, which would also
save time.
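
As a rough sketch of that (made-up helper name, reusing the
SeparableFilter struct from the earlier sketch), each line would be
expanded from the two vectors the first time it is requested and
cached after that:

  /* Return line i of the expanded table, building it lazily from the
   * 1-d vectors.  built[] has one flag per line and starts zeroed,
   * so each line is expanded at most once. */
  static const int *
  get_table_line (const SeparableFilter *f,
                  int                    i,
                  int                   *table,   /* n_y * n_x entries */
                  unsigned char         *built)   /* n_y flags */
  {
    if (!built[i])
      {
        int j;

        for (j = 0; j < f->n_x; j++)
          table[i * f->n_x + j] =
            (int) (f->y_weights[i] * f->x_weights[j] * 65536 + 0.5);

        built[i] = 1;
      }

    return table + i * f->n_x;
  }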

> I'm a little wary of an approach, however, that works well
> for scaling from 256x256 to 64x64 but suddenly takes much
> longer and much more memory to scale to, say, 63x63.

Another way of looking at it is that the code would run faster for
some scalings (such as 256x256 to 64x64) and at the current speed for
others (such as 63x63).  There is no good reason to penalize the
favorable scale factors by building the full table when it is not
needed.
 
> If we stick with a table based approach for all scale factors,
> one thing we might want to consider is simply computing each
> 2-D filter coefficient as needed. If we convert the 1-D tables
> to integers and normalize them to sum correctly, then this
> should only require 1 multiplication per source pixel
> read, which isn't a huge deal. (We are doing ~20 arithmetic 
> operations per destination pixel for the compositing.)

Right, a multiplication isn't really any more effort than an array
lookup anyway.

In other words, you are suggesting that the two vectors be vectors of
integers rather than doubles, so that they can simply be multiplied
together rather than doing the "* 65536 + 0.5" for each cell in the
convert_vector_to_table function (or the convert function could be
dropped entirely and the multiplications done on the fly).
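
For example (a sketch only, not the existing code), if the two
vectors are pre-converted to 16.16 fixed point, the 2-D coefficient
for one source pixel is a single multiply plus a shift:

  /* x_weights[] and y_weights[] are assumed to already be 16.16
   * fixed point.  Their product is 32.32; shifting by 16 brings the
   * 2-D coefficient back to the 16.16 range the compositing code
   * uses. */
  static inline int
  filter_weight (const int *x_weights,
                 const int *y_weights,
                 int        i,
                 int        j)
  {
    return (int) (((long long) y_weights[i] * x_weights[j]) >> 16);
  }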

That said, I would like to mention the needs of mediaLib.  One nice
thing we can do with mediaLib is to use the vector arrays of doubles
directly rather than convert to integers at all.  This avoids needing
to call the "correct_totals" function in the mediaLib case since there
are no rounding errors.  Since mediaLib requires the vectors of doubles
and GTK+ requires integers, the logic could be made flexible for both
uses in one of the following ways:

1. introduce the rounding errors outside of the *-make-weights
   functions, perhaps by converting the double vectors to integer
   vectors in a separate function.
   
2. allow the *-make-weights functions to return both the double and
   integer vectors.  Since the double values need to be computed
   anyway, this wouldn't really be any extra work.
   
3. have the various *-make-weights functions take an argument to 
   specify whether the vectors should be returned as doubles or as
   integers.
   
4. the mediaLib patch could just #ifdef the code to make the
   *-make-weights functions return double arrays.
   
I think I like solution #3 the best.  It's clean, simple & clear.
Solution #4 is my least favorite choice, since it would be nice if the
mediaLib code could just integrate nicely with the *-make-weights
functions without having to #ifdef them.  I think making the vector
of doubles available is a nice thing since it is uncorrupted by
rounding errors, and may be generally useful to future enhancements,
etc.
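
To illustrate option #3 (hypothetical types and prototype, not the
real pixops.c declarations), the caller would simply say which
representation it wants filled in:

  typedef enum {
    WEIGHTS_AS_DOUBLES,    /* exact values, e.g. for mediaLib */
    WEIGHTS_AS_INTEGERS    /* 16.16 fixed point, for the generic code */
  } WeightsFormat;

  typedef struct {
    int           n;
    WeightsFormat format;
    union {
      double *d;           /* valid when format == WEIGHTS_AS_DOUBLES */
      int    *i;           /* valid when format == WEIGHTS_AS_INTEGERS */
    } weights;
  } WeightVector;

  /* A *-make-weights function along these lines would fill in vec_x
   * and vec_y in the requested format; the double case never sees
   * the "* 65536 + 0.5" rounding step at all. */
  void bilinear_make_weights (WeightVector *vec_x,
                              WeightVector *vec_y,
                              double        x_scale,
                              double        y_scale,
                              double        overall_alpha,
                              WeightsFormat format);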

> > 3. The need to include overall_alpha in the structure is that it
> >    indicates what the tables sum to.  The vectors will always sum
> >    to overall_alpha.  This information is needed for later processing
> >    (correcting totals for 2d integer tables, or when scaling only
> >    in one dimension - see below for more information about this).
> 
> [...]
> 
> > 2. I put much thought into what to do with overall_alpha. 
> >    Spreading overall_alpha evenly over both dimensions is correct for
> >    all situations except for when scaling an image in one dimension
> >    only.  In this case you would obviously want to multiply that
> >    dimension by overall_alpha (when overall_alpha != 1.0), instead of
> >    sqrt(overall_alpha).  Note that GTK+ currently doesn't support
> >    scaling in one dimension, but setting up the vectors in this
> >    fashion takes into account the future evolution of the code.
> >    
> >    I ended up deciding that the cleanest solution is to evenly 
> >    distribute overall_alpha across the x/y dimension.  The one corner
> >    case mentioned above can easily be handled by multiplying the
> >    one vector used by the additional constant of sqrt(overall_alpha).
> >    
> >    An alternative solution that is equally correct is to force the
> >    *make-weights functions to *always* return tables that sum to 1.0
> >    (rather than to overall_alpha).  And then move the multiplication of
> >    overall_alpha to the conversion function.  This would be slightly
> >    slower, but might be considered more mathematically pure.
> 
> This is almost certainly the right thing to do. The overall alpha
> has nothing to do with the structure of the filter tables; it's
> just included in the filter tables as an optimization to speed
> things up. It might even make sense to drop the optimization entirely
> and do it when computing each pixel if that would simplify dealing
> with roundoff errors.

It's not clear to me whether you are suggesting that the tables should
sum to 1.0 or to overall_alpha as being "the right thing to do".  :)

> > Lastly I would like to discuss a few problems that I noticed with the
> > interpolation tables that are generated with INTERP_PIXOPS_HYPER.  These
> > problems exist equally in the current code and in my patch.
> > 
> > 1. I notice that when using INTERP_PIXOPS_TILE and INTERP_PIXOPS_BILINEAR
> >    that the corrections range from -16 to +16.  This range of correction
> >    makes sense due to correcting for rounding errors.  This range of 
> >    32 is only .04% of 65536, so the correction will obviously not 
> >    greatly affect the end result.
> >    
> >    However, with INTERP_PIXOPS_HYPER the correction is often very large,
> >    sometimes as high as 11943.  This is 18.22% of 65536, which is
> >    obviously significant enough to cause significant distortion to the
> >    filter when the correction is not applied evenly to the filter.  The
> >    current correct_total() implementation only corrects the 1st value
> >    in the table it finds that can handle the correction without going
> >    negative.  I suspect that this creates unacceptable results, especially
> >    for a filter which is intended to be of highest quality (and is also
> >    the slowest).  I suspect that the problem could be corrected by 
> >    improving the logic of bilinear_make_weights.  However, the problem
> >    could also be addressed by distributing the correction more evenly
> >    across the filter.
> 
> If the HYPER table is that far off, it's just buggy, and we need to
> debug it.

Yes, I didn't have time to look into that problem.  I thought it made
sense to do one thing at a time, rather than trying to fix the HYPER
table at the same time as introducing separable filters.  :)  I do 
think there is a bug in the HYPER table if the totals are so far off.
Using my test program, if you set DEBUG_PRINT_CORRECTION you can see
the large corrections printed out to the screen.
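
As one illustration of spreading the correction more evenly (a sketch
only, not the current correct_total() code), the residual could be
applied one unit at a time across the whole table rather than dumped
onto the first coefficient that can absorb it:

  /* Spread a correction (positive or negative) over the table one
   * 1/65536 step at a time, so no single coefficient is distorted by
   * a large amount; never drive a coefficient negative.  Stops if a
   * full pass over the table makes no progress. */
  static void
  spread_correction (int *table, int n_entries, int correction)
  {
    int step = (correction > 0) ? 1 : -1;
    int i = 0;
    int skipped = 0;

    while (correction != 0 && skipped < n_entries)
      {
        if (table[i] + step >= 0)
          {
            table[i] += step;
            correction -= step;
            skipped = 0;
          }
        else
          skipped++;

        i = (i + 1) % n_entries;
      }
  }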

> I haven't reviewed the details of your code yet. It looks reasonable
> in general, except for the overall_alpha problem and some of the
> function names. (E.g., bilinear_quadrant() should be renamed if it
> is dealing with half of the [0,1] interval rather than a quarter
> of the [0,1]x[0,1] square. :-)

Yes, that makes sense.

Speaking of function name confusion, our mediaLib expert had this
question about function naming in GTK+:

I am confused by the 0888 and 8880 notation in function names in this
file.  At first I thought the 0 was used to mark the alpha channel,
but if that's the case, then gdk_rgb_convert_0888_br() and
gdk_rgb_convert_8880_br() should exchange their names.
 
> I think it probably makes sense to go ahead and apply some version
> of your patch; making the code faster never hurts, and making the
> code simpler will help future changes to actually address the important
> problems the code has.

Yes, since the work I did integrating mediaLib with GTK+ would
clearly also benefit the GTK+ code in general, I wanted to share it
with you.  I think it also makes the code more readable, simple &
clear, which is a bonus for maintenance.

I hope that I have explained mediaLib's needs:

1. It needs the interpolation tables specified as 1-d vectors rather
   than as 2-d tables.
   
2. It requires that the vectors are vectors of doubles that sum to
   either 1.0 or overall_alpha.  If they sum to 1.0, then the mediaLib
   patch will simply have to convert them to sum to overall_alpha,
   which is a simple thing to do (see the sketch below).  The time
   savings of using mediaLib greatly outweigh the time it takes to
   convert the 1-d vectors in this fashion.
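
That conversion really is trivial.  A minimal sketch (made-up
function name), assuming each vector comes back summing to 1.0 and
the pair together should produce a table summing to overall_alpha:

  #include <math.h>

  /* Rescale a vector that sums to 1.0 by sqrt (overall_alpha); doing
   * this to both the x and y vectors makes their outer product (the
   * 2-D table) sum to overall_alpha. */
  static void
  scale_vector_by_alpha (double *weights, int n, double overall_alpha)
  {
    double factor = sqrt (overall_alpha);
    int i;

    for (i = 0; i < n; i++)
      weights[i] *= factor;
  }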
   
Obviously my patch was geared towards providing the above for mediaLib.
I understand that GTK+'s needs are slightly different since it works
with tables of integers.  As I said above, I hope that the various
*-make-weights functions can be tuned to return a structure that meets
both mediaLib's and GTK+'s needs, if at all possible.

As I mentioned, I think making the *-make-weights functions optionally
return the double vectors is a useful feature, since these vectors
are uncorrupted by rounding errors.  While GTK+ itself may not currently
have a use for this, it is a nice option to have around for future
enhancements, etc. (aside from the fact that it makes GTK+ integrate
more nicely with mediaLib).

Have you ever considered making the scale/composite functions allow
the user to specify their own interpolation table/vectors?

Brian



