# Re: gdk_pixbuf interpolation tables

• From: Owen Taylor <otaylor redhat com>
• To: Brian Cameron <Brian Cameron Sun COM>
• Cc: gtk-devel-list gnome org, James Cheng Sun COM, Paul Sandoz Sun COM, leo binchy Sun COM
• Subject: Re: gdk_pixbuf interpolation tables
• Date: 19 Feb 2003 14:47:57 -0500

On Wed, 2003-02-19 at 05:54, Brian Cameron wrote:
> Owen:
>
> > Not having touched this code in a while, I had to spend some
> > time going back through the code and the thought processes
> > behind it. While doing that I wrote up some of the "math"
> > behind it:
> >
> >  http://people.redhat.com/otaylor/gtk/pixbuf-transform-math.ps
> >
> > It's nothing very deep... just a little bit of symbol juggling
> > really. But it makes it a little more obvious why
> > the filters we use are actually separable.
>
> Still, it is a very useful document.  Since it is only 64K, you might
> consider including it in the gtk+ module as background info.

Yeah, I was planning to do that.

> > I've also documented in there what the filters are that are being
> > used.
> >
> > The observation that all of the filters we use now are separable
> > is an important one. (I had some inkling of this before as shown
> > by how I defined the "BILINEAR" filter, but never really made the
> > generalization). It certainly does reduce the complexity and
> > expense of calculating the tables a lot, and anything that makes
> > pixops.c shorter and simpler (not to mention faster) is a good thing.
> >
> > Still, I wonder if pursuing this is really right in the long
> > term:
> >
> >  - Once calculating the filter tables gets to be a big
> >    bottleneck, we're already in a regime where our
> >    algorithms aren't a good idea.
>
> I should point out that the reason that I made this change had nothing
> to do with performance.  I wanted to more tightly integrate the GTK+
> scaling code with Sun's mediaLib library (which takes advantage of the
> VIS multimedia speed up chip).  The most generic mediaLib scaling
> functions have the following limitations:

To be really blunt, I don't care about medialib.

If we can improve the gdk-pixbuf code and that happens to also
help medialib integration, great.

But restructuring gdk-pixbuf to make it possible to back it
onto a closed-source library holds very little interest for me.

Any major changes have to be justified in terms of what they do for
the code shipped with gdk-pixbuf.

> > > 2. Separated tables can save memory.  It is only necessary to build
> > >    the full 2d tables as they are accessed.  For example, most of
> > >    the pixops_process functions deal with 1 line of tables.  Therefore
> > >    the logic could easily be modified to only build each line of
> > >    tables as needed.  This has no extra CPU cost, but reduces memory
> > >    used.  I did not yet implement this change, however I think it is
> > >    clear that this would be fairly straightforward to do.
> >
> > Building the tables on an as-needed basis doesn't save you much
> > on "average", since for almost all scale factors, you'll need
> > all the horizontal and vertical phases. Though the cases
> > where the scale-down factor is a small integer fraction are
> > certainly more common than others. (That is, to scale from
> > 200 to 160, the scale factor is 5:4, so we only need 4 phases
> > 0, 0.25, 0.50, 0.75 rather than all 16 phases.)
>
> The only real advantage of building the tables as needed is that you
> don't have to build the 2d table to begin with, saving the memory and
> time.  However, I noticed that the table is accessed multiple times
> in pixops_process (it is passed to process_pixel then to *line_func
> and then to process_pixel again), so the code may need to be
> reorganized in order to really benefit from this sort of technique.
> Obviously you lose the value of calculating the value on-the-fly
> if you have to do the calculation multiple times.
>
> Another advantage is that if you are only scaling in one dimension,
> then you only need to build one vector!  The code to support this
> isn't really there yet, but I think the concept is clear.
>
> > I think it probably would work best to compute the needed phases
> > in advance and pregenerate them rather than doing it on-demand,
> > since it avoids complexity in the inner horizontal loop.
>
> Yes, what could be done is just to build each line of tables as
> needed.  Especially if reorganizing the code isn't possible.
> Each line could be built as it is accessed for the first time,
> which would also save time.

But you don't actually always need all the filters for a complete
line ... remember, the code computes filter matrices for all 16x16
combinations of horizontal and vertical phases. If you are scaling
horizontally by an integer factor, then you only need one horizontal
phase.
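
As a quick sketch of the phase counting (helper names are mine, not
pixops.c): with phases quantized to 1/16 of a pixel, the number of
distinct phases a rational scale factor actually exercises is 16
divided by gcd(16, step), where step is the per-destination-pixel
advance in 1/16 units:

```c
#include <assert.h>

#define SUBSAMPLE 16  /* gdk-pixbuf quantizes filter phases to 1/16 pixel */

static int gcd(int a, int b)
{
    while (b) { int t = a % b; a = b; b = t; }
    return a;
}

/* Hypothetical helper (not the actual pixops.c code): how many distinct
 * phases are used when scaling src pixels to dst pixels?  If the
 * per-destination-pixel step doesn't land exactly on the 1/16 grid,
 * all SUBSAMPLE phases eventually occur. */
static int distinct_phases(int src, int dst)
{
    if ((SUBSAMPLE * src) % dst != 0)
        return SUBSAMPLE;
    int step = SUBSAMPLE * src / dst;   /* e.g. 200 -> 160 gives step 20 */
    return SUBSAMPLE / gcd(SUBSAMPLE, step);
}
```

This reproduces the examples in the thread: 200 -> 160 (5:4) needs 4
phases, 256 -> 64 needs 1, and 256 -> 63 needs all 16.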

If we wanted to do build-on-demand, then the algorithm I would
use is simply to loop over the rows and columns separately,
compute the needed horizontal and vertical phases, and then
fill them in.

You would want to change the data structure a bit from one
big array of doubles, to a 16x16 array of pointers to arrays
of doubles.
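
A minimal sketch of that data structure (all names illustrative, not
the actual pixops.c code): a 16x16 array of pointers, where each slot
is filled with the outer product of the separable 1-D weights the
first time that phase pair is needed.

```c
#include <assert.h>
#include <stdlib.h>

#define SUBSAMPLE 16     /* 16 phases per axis, as in pixops.c */
#define N_TAPS 4         /* hypothetical filter width */

typedef struct {
    double *tables[SUBSAMPLE][SUBSAMPLE];   /* NULL until first use */
    double x_weights[SUBSAMPLE][N_TAPS];    /* 1-D horizontal weights */
    double y_weights[SUBSAMPLE][N_TAPS];    /* 1-D vertical weights */
} LazyFilter;

/* Return the 2-D coefficient block for one phase pair, computing and
 * caching it on first access. */
static const double *lazy_filter_get(LazyFilter *f, int x_phase, int y_phase)
{
    double **slot = &f->tables[y_phase][x_phase];
    if (!*slot) {
        double *t = malloc(N_TAPS * N_TAPS * sizeof(double));
        for (int i = 0; i < N_TAPS; i++)
            for (int j = 0; j < N_TAPS; j++)
                t[i * N_TAPS + j] =
                    f->y_weights[y_phase][i] * f->x_weights[x_phase][j];
        *slot = t;
    }
    return *slot;
}
```

Phase pairs that are never touched (e.g. all but one horizontal phase
for an integer scale factor) then cost nothing.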

> > I'm a little wary of an approach, however, that works well
> > for scaling from 256x256 to 64x64 but suddenly takes much
> > longer and much more memory to scale to, say, 63x63.
>
> Another way of looking at it would be that the code would run
> faster for scaling some (256x256 or 64x64) but would run at the
> current speed for others (63x63).  There is no real good reason
> to penalize some scale operations by building the full table when
> it is not needed.

Yes and no ... as an optimization, it's fine to make some
sizes faster, but all the sizes have to be at least acceptably
fast.

> > If we stick with a table based approach for all scale factors,
> > one thing we might want to consider is simply computing each
> > 2-D filter coefficient as needed. If we convert the 1-D tables
> > to integers and normalize them to sum correctly, then
> > should only require 1 multiplication per source pixel
> > read, which isn't a huge deal. (We are doing ~20 arithmetic
> > operations per destination pixel for the compositing.)
>
> Right, a multiplication isn't really any more effort than an array
> lookup anyway.
>
> Yes.  In other words, you are suggesting that the two vectors be
> vectors of integers rather than doubles, so that you can just multiply
> them together rather than having to do the "* 65536 + 0.5" for each
> cell in the convert_vector_to_table function (or getting rid of
> this convert function and doing the multiplications on the fly).

Yep, you take advantage of the distributive rule:

\sum_i{a_i} = N
\sum_j{b_j} = N

implies:

\sum_{i,j}{a_i*b_j} = N*N

which takes care of normalization.
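
In C, the payoff of that identity looks roughly like this (a sketch,
names mine): if each 1-D weight vector is converted to integers summing
to 65536, the implied 2-D table sums to 65536 * 65536, so filtering a
pixel costs one extra multiply per source pixel plus a single final
shift, with no per-table renormalization.

```c
#include <assert.h>

#define ONE 65536  /* 16.16 fixed point, as in pixops.c */

/* Weighted sum of an ny-by-nx block of source samples, where xw sums to
 * ONE and yw sums to ONE.  The 2-D coefficient for tap (i,j) is just
 * xw[i]*yw[j]; since the whole table sums to ONE*ONE == 2^32, dividing
 * by the total weight is a single 32-bit shift. */
static long long filter_pixel(const int *xw, int nx, const int *yw, int ny,
                              const unsigned char *src /* ny rows of nx */)
{
    long long acc = 0;
    for (int j = 0; j < ny; j++)
        for (int i = 0; i < nx; i++)
            acc += (long long)xw[i] * yw[j] * src[j * nx + i];
    return acc >> 32;  /* divide by ONE*ONE */
}
```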

> That said, I would like to mention the needs of mediaLib.  One nice
> thing we can do with mediaLib is to use the vector arrays of doubles
> directly rather than convert to integers at all.  This avoids needing
> to call the "correct_totals" function in the mediaLib case since there
> are no rounding errors.  Since mediaLib requires the vectors of doubles
> and GTK+ requires integers, the logic could be made flexible for both
> uses by doing the following:
>
> 1. introduce these errors outside of the *-make-weights function.
>    Perhaps by converting the double vectors to integer vectors in a
>    different function.
>
> 2. allow the *-make-weights functions to return both the double and
>    integer vectors.  Since the double value needs to be computed anyway
>    this wouldn't really be any extra work.
>
> 3. have the various *-make-weights functions take an argument to
>    specify whether the vectors should be returned as doubles or as
>    integers.
>
> 4. the mediaLib patch could just #ifdef the code to make the
>    *-make-weights functions return double arrays.
>
> I think I like solution #3 the best.  It's clean, simple & clear.
> Solution #4 is my least favorite choice, since it would be nice if the
> mediaLib code could just integrate nicely with the *-make-weights
> functions without having to #ifdef them.  I think making the vector
> of doubles available is a nice thing since it is uncorrupted by
> rounding errors, and may be generally useful to future enhancements,
> etc.

I definitely would prefer 1; go with simplicity unless profiling
shows a bottleneck.

[...]

> > > 2. I put much thought into what to do with overall_alpha.
> > >    Spreading overall_alpha evenly over both dimensions is correct for
> > >    all situations except for when scaling an image in one dimension
> > >    only.  In this case you would obviously want to multiply that
> > >    dimension by overall_alpha (when overall_alpha != 1.0), instead of
> > >    sqrt(overall_alpha).  Note that GTK+ currently doesn't support
> > >    scaling in one dimension, but setting up the vectors in this
> > >    fashion takes into account the future evolution of the code.
> > >
> > >    I ended up deciding that the cleanest solution is to evenly
> > >    distribute overall_alpha across the x/y dimensions.  The one corner
> > >    case mentioned above can easily be handled by multiplying the
> > >    one vector used by the additional constant of sqrt(overall_alpha).
> > >
> > >    An alternative solution that is equally correct is to force the
> > >    *make-weights functions to *always* return tables that sum to 1.0
> > >    (rather than to overall_alpha).  And then move the multiplication of
> > >    overall_alpha to the conversion function.  This would be slightly
> > >    slower, but might be considered more mathematically pure.
> >
> > This is almost certainly the right thing to do. The overall alpha
> > has nothing to do with the structure of the filter tables; it's
> > just included in the filter tables as an optimization to speed
> > things up. It might even make sense to drop the optimization entirely
> > and do it when computing each pixel if that would simplify dealing
> > with roundoff errors.
>
> It's not clear to me whether you are suggesting that the tables should
> sum to 1.0 or to overall_alpha as being "the right thing to do".  :)

I was saying that the X and Y vectors should sum to 1, not to
sqrt(overall_alpha).
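
A sketch of what that would look like (function and parameter names are
mine, not the actual gdk-pixbuf API): the *make-weights functions return
pure filter vectors summing to 1.0, and overall_alpha is folded in
exactly once, during the fixed-point conversion of one of the two
vectors.

```c
#include <assert.h>

/* Convert a 1-D weight vector to 16.16 fixed point, folding an optional
 * scale factor (e.g. overall_alpha) into the conversion. */
static void vector_to_fixed(const double *v, int n, double scale, int *out)
{
    for (int i = 0; i < n; i++)
        out[i] = (int)(v[i] * scale * 65536 + 0.5);
}
```

Converting the x vector with scale = 1.0 and the y vector with
scale = overall_alpha means each 2-D coefficient xw[i]*yw[j] carries
overall_alpha exactly once, rather than sqrt(overall_alpha) in each
dimension.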

[...]

> Speaking of function name confusion, our mediaLib expert had this
> question about function naming in GTK+:
>
> I am confused by the notation of 0888 and 8880 on function names
> in this file.  First I thought 0 is used to mark the alpha channel.
> But if that's the case, then gdk_rgb_convert_0888_br() and
> gdk_rgb_convert_8880_br() should exchange their names

The answer is that there is no overall convention that covers all of
GDK and gdk-pixbuf. Different portions have different conventions.

[...]

> Have you ever considered making the scale/composite functions allow
> the user to specify their own interpolation table/vectors?

If we reworked the API as I sketched out yesterday, this would
certainly be possible. I'm not sure it's actually useful ...
the current API already has too much flexibility for most people
in choice of filtering.

Regards,
Owen