Re: gdk_pixbuf interpolation tables

Owen:

> > > I've also documented in there what the filters are that are being
> > > used.
> > > 
> > > The observation that all of the filters we use now are separable
> > > is an important one. (I had some inkling of this before as shown
> > > by how I defined the "BILINEAR" filter, but never really made the
> > > generalization). It certainly does reduce the complexity and
> > > expense of  calculating the tables a lot, and anything that makes
> > > pixops.c shorter and simpler (not to mention faster) is a good thing.
> > >
> > > Still, I wonder if pursuing this is really right in the long
> > > term:
> > > 
> > >  - Once calculating the filter tables gets to be a big
> > >    bottleneck, we're already in a regime where our
> > >    algorithms aren't a good idea.
> > 
> > I should point out that the reason that I made this change had nothing
> > to do with performance.  I wanted to more tightly integrate the GTK+
> > scaling code with Sun's mediaLib library (which takes advantage of the
> > VIS multimedia speed up chip).  The most generic mediaLib scaling 
> > functions have the following limitations:
> 
> To be really blunt, I don't care about medialib.
> 
> If we can improve the gdk-pixbuf code and that happens to also
> help medialib integration, great.
> 
> But restructuring gdk-pixbuf to make it possible to backend off
> of a closed source library has very small interest for me.
> 
> Any major changes have to be justified in terms of what they do for
> the code shipped with gdk-pixbuf.

Obviously.  Of course I am not suggesting that GTK should do anything
inappropriate or time-consuming in order to make life easier for
mediaLib.  However, if we can arrange the code so it works well for
both GTK+ and mediaLib without compromising either, that would be
nice.  The only reason I was explaining about mediaLib at all is
because:

1. If the code can easily be made to work for both without any
   compromising, then we can plan to take advantage of this
   synergy.
2. Since my patch was made to make GTK+ integrate with mediaLib
   I wanted you to understand where the patch came from, and that
   the work wasn't done for general performance tuning reasons, etc.

Although I hope that we can make the same code work for both GTK+ and
mediaLib, if we can't then this simply means that mediaLib will need
to #ifdef its own *make-weights functions, which is no big deal.
 
> > > > 2. Separated tables can save memory.  It is only necessary to build
> > > >    the full 2d tables as they are accessed.  For example, most of
> > > >    the pixops_process functions deal with 1 line of tables.  Therefore
> > > >    the logic could easily be modified to only build each line of 
> > > >    tables as needed.  This has no extra CPU cost, but reduces memory
> > > >    used.  I have not yet implemented this change; however, I think
> > > >    it is clear that this would be fairly straightforward to do.
> > > 
> > > Building the tables on an as-needed basis doesn't save you much
> > > on "average", since for almost all scale factors, you'll need
> > > all the horizontal and vertical phases. Though the cases
> > > where the scale-down factor is a small integer fraction are
> > > certainly more common than others. (That is, to scale from
> > > 200 to 160, the scale factor is 5:4, so we only need 4 phases
> > > 0, 0.25, 0.50, 0.75 rather than all 16 phases.)
> > 
> > The only real advantage of building the tables as needed is that you
> > don't have to build the 2d table to begin with, saving the memory and
> > time.  However, I noticed that the table is accessed multiple times
> > in pixops_process (it is passed to process_pixel then to *line_func
> > and then to process_pixel again), so the code may need to be 
> > reorganized in order to really benefit from this sort of technique.
> > Obviously you lose the value of calculating the value on-the-fly
> > if you have to do the calculation multiple times.
> > 
> > Another advantage is that if you are only scaling in one dimension,
> > then you only need to build one vector!  The code to support this
> > isn't really there yet, but I think the concept is clear.
> >
> > > I think it probably would work best to compute the needed phases
> > > in advance and pregenerate them rather than doing it on-demand,
> > > since it avoids complexity in the inner horizontal loop.
> > 
> > Yes, what could be done is just to build each line of tables as
> > needed.  Especially if reorganizing the code isn't possible.  
> > Each line could be built as it is accessed for the first time,
> > which would also save time.
> 
> But you don't actually always need all the filters for a complete
> line ... remember, the code computes 16x16 filter matrices
> for 16 phases horizontally and vertically. If you are scaling
> horizontally by an integer factor, then you only need one horizontal
> phase.
>
> If we wanted to do build-on-demand, then the algorithm I would
> do is simply to loop over the rows and columns separately, 
> compute the needed horizontal and vertical phases, and then 
> fill them in.
> 
> You would want to change the data structure a bit from one
> big array of doubles, to a 16x16 array of pointers to arrays
> of doubles.

Right. That sounds like a good way to organize the code to do
on-demand building of the tables.  Would you compute which tables
you needed and build them all at once at the beginning of the
pixops_process function or would you build them in the process_pixel
function as they are being accessed?
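Just to make sure I'm picturing the same structure, something like
this?  (A rough sketch only -- the names, and keeping the 1-D vectors
in the same struct, are my invention; SUBSAMPLE is the 16 from
pixops.c.)

    #include <glib.h>

    #define SUBSAMPLE 16   /* matches pixops.c */

    typedef struct {
      int n_x, n_y;                          /* filter taps per dimension   */
      double *x_vecs[SUBSAMPLE];             /* 1-D weights, one per phase  */
      double *y_vecs[SUBSAMPLE];
      double *tables[SUBSAMPLE][SUBSAMPLE];  /* 2-D tables, NULL until used */
    } FilterCache;

    static double *
    get_table (FilterCache *f, int x_phase, int y_phase)
    {
      double **slot = &f->tables[x_phase][y_phase];

      if (*slot == NULL)   /* build this phase combination on first access */
        {
          double *t = g_new (double, f->n_x * f->n_y);
          int i, j;

          for (i = 0; i < f->n_y; i++)
            for (j = 0; j < f->n_x; j++)
              t[i * f->n_x + j] =
                f->y_vecs[y_phase][i] * f->x_vecs[x_phase][j];

          *slot = t;
        }

      return *slot;
    }

Then both pixops_process and process_pixel would go through
get_table() and never need to know whether a given table was
pregenerated or built lazily.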

Though this approach does have the disadvantage that it doesn't speed
up all scale operations equally.  As you point out below, some users
may be annoyed that some scales take longer than others.  However,
I'm not sure that making only some scale operations faster is really
a problem.

> > > I'm a little wary of an approach, however, that works well
> > > for scaling from 256x256 to 64x64 but suddenly takes much
> > > longer and much more memory to scale to, say, 63x63.
> > 
> > Another way of looking at it would be that the code would run
> > faster for some scale factors (say, 256x256 down to 64x64) but
> > would run at the current speed for others (down to 63x63).  There
> > is no good reason to penalize some scale operations by building
> > the full table when it is not needed.
> 
> Yes and no ... as an optimization, it's fine to make some
> sizes faster, but all the sizes have to be at least acceptably
> fast.

They would all be at least as fast as the current code.  I don't 
think the overhead of building the tables as-needed would add any
significant time to the operation.  Assuming that the current code
is "acceptably fast", then there shouldn't be a problem there.

> > > If we stick with a table based approach for all scale factors,
> > > one thing we might want to consider is simply computing each
> > > 2-D filter coefficient as needed. If we convert the 1-D tables
> > > to integers and normalize them to sum correctly, then this
> > > should only require 1 multiplication per source pixel
> > > read, which isn't a huge deal. (We are doing ~20 arithmetic 
> > > operations per destination pixel for the compositing.)
> > 
> > Right, a multiplication isn't really any more effort than an
> > array lookup anyway.
> > 
> > Yes.  In other words, you are suggesting that the two vectors be
> > vectors of integers rather than doubles, so that you can just multiply
> > them together rather than having to do the "* 65536 + 0.5" for each
> > cell in the convert_vector_to_table function (or getting rid of
> > this convert function and doing the multiplications on the fly).
> 
> Yep, to deal with normalization you take advantage of the
> distributive rule:
> 
>  \sum_i{a_i} = N
>  \sum_j{b_j} = N
> 
> implies:
> 
>  \sum_{i,j}{a_i*b_j} = N*N

Right.
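Concretely, and just as an illustration (this is not the real
pixops.c inner loop; the function and the 256/65536 scaling are my
own numbers): if each 1-D integer vector is scaled to sum to 256,
every implied 2-D product table sums to 256*256 = 65536, so the loop
can do the one extra multiply per source pixel and still finish with
the usual shift:

    static unsigned char
    filter_one_channel (const int *x_vec,  /* n_x weights, summing to 256 */
                        const int *y_vec,  /* n_y weights, summing to 256 */
                        const unsigned char *src,   /* n_y * n_x samples  */
                        int n_x, int n_y)
    {
      unsigned int r = 0;
      int i, j;

      for (i = 0; i < n_y; i++)
        for (j = 0; j < n_x; j++)
          r += x_vec[j] * y_vec[i] * src[i * n_x + j];  /* one extra multiply */

      return (r + 0x8000) >> 16;  /* weights sum to 65536: round and shift */
    }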
 
> > That said, I would like to mention the needs of mediaLib.  One nice
> > thing we can do with mediaLib is to use the vector arrays of doubles
> > directly rather than convert to integers at all.  This avoids needing
> > to call the "correct_totals" function in the mediaLib case since there
> > are no rounding errors.  Since mediaLib requires the vectors of doubles
> > and GTK+ requires integers, the logic could be made flexible for both
> > uses by doing the following:
> > 
> > 1. introduce the rounding errors outside of the *-make-weights
> >    functions, perhaps by converting the double vectors to integer
> >    vectors in a separate function.
> >    
> > 2. allow the *-make-weights functions to return both the double and
> >    integer vectors.  Since the double value needs to be computed anyway
> >    this wouldn't really be any extra work.
> >    
> > 3. have the various *-make-weights functions take an argument to 
> >    specify whether the vectors should be returned as doubles or as
> >    integers.
> >    
> > 4. the mediaLib patch could just #ifdef the code to make the
> >    *-make-weights functions return double arrays.
> >    
> > I think I like solution #3 the best.  It's clean, simple & clear.
> > Solution #4 is my least favorite choice, since it would be nice if the
> > mediaLib code could just integrate nicely with the *-make-weights
> > functions without having to #ifdef them.  I think making the vector
> > of doubles available is a nice thing since it is uncorrupted by
> > rounding errors, and may be generally useful to future enhancements,
> > etc.
> 
> I definitely would prefer 1; go with simplicity unless profiling
> shows a bottleneck.

I think this approach is a good one.  While looping over the vectors
a second time adds a tiny bit of work, it has some real advantages.
Mainly, it keeps the interpolation tables generic and clear.  The
fact that both mediaLib and GTK+ could use them is a good sign that
the code is building genuinely generic tables.  In other words,
keeping the logic specific to how the tables are consumed (conversion
to integers, the handling of overall_alpha) out of the logic that
builds the mathematically pure interpolation tables is a good
separation, I think.
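For what it's worth, the conversion step for #1 could be quite small.
A sketch only -- the function name and the way of absorbing the
rounding error are invented, though correct_totals does a similar job
today:

    #include <math.h>

    /* Scale a 1-D weight vector that sums to 1.0 into integers that
       sum exactly to 'total' (e.g. 256), pushing any rounding error
       into the center tap.  mediaLib would skip this step entirely
       and use the doubles as-is. */
    static void
    vector_to_int (const double *in, int *out, int n, int total)
    {
      int i, sum = 0;

      for (i = 0; i < n; i++)
        {
          out[i] = (int) floor (in[i] * total + 0.5);
          sum += out[i];
        }

      out[n / 2] += total - sum;   /* absorb the rounding error */
    }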
 
> [...]
> 
> > > > 2. I put much thought into what to do with overall_alpha. 
> > > >    Spreading overall_alpha evenly over both dimensions is correct for
> > > >    all situations except for when scaling an image in one dimension
> > > >    only.  In this case you would obviously want to multiply that
> > > >    dimension by overall_alpha (when overall_alpha != 1.0), instead of
> > > >    sqrt(overall_alpha).  Note that GTK+ currently doesn't support
> > > >    scaling in one dimension, but setting up the vectors in this
> > > >    fashion takes into account the future evolution of the code.
> > > >    
> > > >    I ended up deciding that the cleanest solution is to evenly 
> > > >    distribute overall_alpha across the x/y dimension.  The one corner
> > > >    case mentioned above can easily be handled by multiplying the
> > > >    one vector used by the additional constant of sqrt(overall_alpha).
> > > >    
> > > >    An alternative solution that is equally correct is to force the
> > > >    *make-weights functions to *always* return tables that sum to 1.0
> > > >    (rather than to overall_alpha).  And then move the multiplication of
> > > >    overall_alpha to the conversion function.  This would be slightly
> > > >    slower, but might be considered more mathematically pure.
> > > 
> > > This is almost certainly the right thing to do. The overall alpha
> > > has nothing to do with the structure of the filter tables; it's
> > > just included in the filter tables as an optimization to speed
> > > things up. It might even make sense to drop the optimization entirely
> > > and do it when computing each pixel if that would simplify dealing
> > > with roundoff errors.
> > 
> > It's not clear to me whether you are suggesting that the tables should
> > sum to 1.0 or to overall_alpha as being "the right thing to do".  :)
> 
> I was saying that the X and Y vectors should sum to 1, not to
> sqrt(overall_alpha).

Yes, I think that is cleaner.
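To spell out how overall_alpha would then fold in at conversion time
(the table[i][j] line below is illustrative, not the literal code):
if

 \sum_i{x_i} = 1
 \sum_j{y_j} = 1

then converting with

 table[i][j] = floor(x_i * y_j * overall_alpha * 65536 + 0.5)

gives \sum_{i,j}{table[i][j]} ~= overall_alpha * 65536, which is
exactly what the integer paths expect, and the one-dimensional case
needs no sqrt(overall_alpha) trick at all.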

> [...]
> 
> > Speaking of function name confusion, our mediaLib expert had this
> > question about function naming in GTK+:
> > 
> > I am confused by the notation of 0888 and 8880 on function names
> > in this file.  First I thought 0 is used to mark the alpha channel.
> > But if that's the case, then gdk_rgb_convert_0888_br() and
> > gdk_rgb_convert_8880_br() should exchange their names 
> 
> I seem to remember answering this somewhere. The basic answer
> is that there is no overall convention that covers all of GDK
> and gdk-pixbuf. Different portions have different conventions.

I see.  I did ask you this before; perhaps you answered and I missed
it, since I don't remember getting a response.  Apologies, and thanks
for the info.

> [...]
> 
> > Have you ever considered making the scale/composite functions allow
> > the user to specify their own interpolation table/vectors?
> 
> If we reworked the API as I sketched out yesterday, this would 
> certainly be possible. I'm not sure it's actually useful ... 
> the current API already has too much flexibility for most people
> in choice of filtering.

Right.  It probably wouldn't be of much use to the general user.
However, applications with very specific, high-end graphics needs
might want to supply alternative tables of their own; I could see
uses for this in programs like the GIMP.

Brian
