gdk_pixbuf interpolation tables



GTK developers:

I suspect some of you may remember the email I sent to this list 
on January 8th asking for help with separating the various interpolation
tables.  I have made good progress, and am attaching a patch for review,
and a test program which demonstrates that the new methods generate 
precisely the same interpolation tables while using much less CPU.
Since the details are involved, and some of my decisions require 
explination/justification, I will discuss this below.

Before I go into the details I want to thank James Cheng of the Sun
mediaLib team for pointing out these opportunities in the first
place and for giving me the direction and motivation to make the
changes.  I also want to thank Paul Sandoz who is a Sun developer
who helped me separate the PIXOPS_INTERP_HYPER table.

First I will discuss the advantages of using separated tables.  Most
interpolation tables are separable (including all the ones used
currently by GTK+).  A separatable table is a table that can be built
by multiplying two vectors together.  One vector is a 1d table that
has n_x*SUBSAMPLE values and the other vector is a 1d table that has
n_y*SUBSAMPLE values.  The full 2d table can be generated by simply
multiplying the two vectors together.

Advantages of separated interpolation tables:

1. Building separated tables saves CPU time.  The math involved with
   computing the vectors is typically involved, with if-tests
   embedded in for-loops, etc.  Doing this complicated work only
   when building the 1d vectors, and doing a simple multiplication
   to generate the 2d table is obviously faster than doing the
   involved math to build every cell in the 2d tables.

2. Separated tables can save memory.  It is only necessary to build
   the full 2d tables as they are accessed.  For example, most of
   the pixops_process functions deal with 1 line of tables.  Therefore
   the logic could easily be modified to only build each line of 
   tables as needed.  This has no extra CPU cost, but reduces memory
   used.  I did not yet implement this change, however I think it is
   clear that this would be fairly straightforward to do.
   
3. Separated tables allow opportunities to further optimize scale
   operations.  With separated tables it is straightforward to scale
   the x/y dimensions independantly.  In some cases scaling one
   dimension first can be faster than in the other order (like when
   an image only needs to be scaled in a single dimension in the first
   place).  Another example is when an image contains solid colors in
   one dimension (in other words is stripey) and memcpy can be used on
   that dimension instead of a more costly scaling operation.

4. I think anyone who looks at the patch will agree that the logic
   used for separated tables is much more clear and readible, and
   therefore also easier to maintain.

First I want to highlight the changes I made to the PixopsFilter 
structure.  You will notice that it now contains two arrays of 
doubles that corresponds to the two vectors instead of the array
of ints (actually used as a 2d array).  Also the structure now
contains the overall_alpha value.  Let me explain why there is 
value in the PixopsFilter returning these values instead of having
the various *make-weights functions build the 2d array of ints
directly.

1. Some of the (yet unimplemented) improvements mentioned above
   such as building subsections of the table on-the-fly as needed
   or scaling the x/y dimensions independantly require having 
   access to the vectors directly.  In these cases the 2d vector of
   ints is not needed at all, so building the 2d array in the
   *make-weights functions directly creates an unnecessary limitation
   for future improvements.

2. You will notice that the vectors generated now all sum to 1.0
   (or from 0 to overall_alpha in the case when overall_alpha != 1.0).
   The multiplication of 62256 and the addition of the 0.5 is done
   in the convert_vector_to_table function since these operations
   are specific to converting the vectors to a 2d array of integers.
   In the future, some situations may wish to use the double values
   directly.  In other words, it is simply more flexible to leave the
   math which assumes the end result is a 2d table of integers
   outside of the function that builds the vectors to begin with.
   
3. The need to include overall_alpha in the structure is that it
   indicates what the tables sum to.  The vectors will always sum
   to overall_alpha.  This information is needed for later processing
   (correcting totals for 2d integer tables, or when scaling only
   in one dimension - see below for more information about this).

In the various *make-weights functions you will notice that I am
multiplying the x-vector by the (x_scale * sqrt(overall_alpha))
constant and the y-vector by the (y_scale * sqrt(overall_alpha))
constant.  Since the values just get multipled together when building
the 2d table anyway, you might reasonably wonder why so much effort
is going into separating the constants equally on the two vectors.
The reasons are as follows:

1. Multipling the x vector by x_scale and the y_vector by y_scale
   supports the ability to scale the x/y dimensions independantly.
   Such independant scaling would obviously be unbalanced if the
   constants were only applied to one vector.  The same logic is
   why both vectors in bilinear_make_weights are multiplied by 0.5
   rather than multiplying one vector by 0.25.
   
2. I put much thought into what to do with overall_alpha. 
   Spreading overall_alpha evenly over both dimensions is correct for
   all situations except for when scaling an image in one dimension
   only.  In this case you would obviously want to multiply that
   dimension by overall_alpha (when overall_alpha != 1.0), instead of
   sqrt(overall_alpha).  Note that GTK+ currently doesn't support
   scaling in one dimension, but setting up the vectors in this
   fashion takes into account the future evolution of the code.
   
   I ended up deciding that the cleanest solution is to evenly 
   distribute overall_alpha across the x/y dimension.  The one corner
   case mentioned above can easily be handled by multipling the 
   one vector used by the additional constant of sqrt(overall_alpha).
   
   An alternative solution that is equally correct is to force the
   *make-weights functions to *always* return tables that sum to 1.0
   (rather than to overall_alpha).  And then move the multiplication of
   overall_alpha to the conversion function.  This would be slighly
   slower, but might be considered more mathematically pure.

   If the community feels that the alternative implementation makes 
   more sense, then I am happy to update the patch.  I suspect I'm
   splitting hairs here, and worrying overmuch about this.  I 
   suspect either solution is fine.  However, I wanted to explain
   my thinking and why I chose the solution I did.

Now I want to explain some of the changes made to the various
*make-weights functions:

1. A significant amount of the math used to build the 2-d table in
   bilinear_quadrant() was unneccessary.  Note how the equations used in
   bilinear_quadrant() and bilinear_make_weights() became clearly more
   simple after factoring out the x/y into separate vectors.  This
   was the hardest to separate, so I will explain it in the most detail.
   
   Here is the logic which allowed me (with the help of Paul Sandoz)
   to separate the original logic into two separated vectors:
   
   The weight for one element (ignoring scalar constants) after
   bilinear_make_weights() calls bilinear_quadrant four times is:

   = X1 * Y1 + X2 * Y1 + X1 * Y2 + X2 * Y2

   = X1 * (Y1 + Y2) + X2 * (Y1 + Y2)

   = (X1 + X2) * (Y1 + Y2)

   (X1 + X2) will vary for all j when i is constant. (Y1 + Y2) will be 
   constant for all j when i is constant. Thus we have a value in one of 
   the 1D matrix, (Y1 + Y2).

   The same can be applied when j is constant and i is varied.  Therefore
   this is how it is possible to factor out the matrix into the separable
   form.

2. bilinear_make_fast_weights is the most common interpolation table
   function used and it was already building the separate vectors 
   and multiplying them together to generate the full 2-d table anyway.
   However, bilinear_make_fast_weights was unnecessarily slow because it
   builds the same vectors over and over again due to the unnecessary
   double for-loop of i_offset and j_offset (in other words
   SUBSAMPLE * SUBSAMPLE).  Note that the new code only loops over
   SUBSAMPLE a single time to builds the vectors.

3. Other improvements were made to the various *make-weights functions
   to remove unnecessary computation from within for-loops, and 
   factoring out redundant calculations.  You will notice that the math
   in the new code is considerably more simple than the previous logic.
   
Lastly I would like to discuss a few problems that I noticed with the
interpolation tables that are generated with INTERP_PIXOPS_HYPER.  These
problems exists equally in the current code and in my patch.  

1. I notice that when using INTERP_PIXOPS_TILE and INTERP_PIXOPS_BILINEAR
   that the corrections range from -16 to +16.  This range of correction
   makes sense due to correcting for rounding errors.  This range of 
   32 is only .04% of 65536, so the correction will obviously not 
   greatly affect the end result.
   
   However, with INTERP_PIXOPS_HYPER the correction is often very large
   sometimes as high as 11943.  This is 18.22% of 65536, which is 
   obviously significant enough to cause significant distortion to the
   filter when the correction is not applied evenly to the filter.  The
   current correct_total() implementation only corrects the 1st value
   in the table it finds that can handle the correction without going
   negative.  I suspect that this creates unacceptable results, especially
   for a filter which is intended to be of highest quality (and is also
   the slowest).  I suspect that the problem could be corrected by 
   improving the logic of bilinear_make_weights.  However, the problem
   could also be addressed by distributing the correction more evenly
   across the filter.

A few quick pointers about the test program scale.c:

Note the following #defines at the top of the function:

#define DEBUG_PRINT_CORRECTION 0
#define DEBUG_PRINT_TABLE 0
#define DEBUG_PRINT_VECTOR 0

These can be set to any non-zero value to see additional debug output.
Without changing the above values, the test program simply prints some
data about each test case, and any differences found between the 2d
table generated via the prior functions and the 2d table generated by
multiplying the two vectors together.  Note that there are no differences.
The tables are exactly the same using both methods.

Brian
Index: gtk+/gdk-pixbuf/pixops/pixops.c
===================================================================
RCS file: /sgnome/cvsroots/sun-gnome/gtk+/gdk-pixbuf/pixops/pixops.c,v
retrieving revision 1.1.1.1
diff -u -p -r1.1.1.1 pixops.c
--- gtk+/gdk-pixbuf/pixops/pixops.c	14 Oct 2002 17:05:23 -0000	1.1.1.1
+++ gtk+/gdk-pixbuf/pixops/pixops.c	6 Feb 2003 10:22:34 -0000
@@ -14,11 +14,13 @@ typedef struct _PixopsFilter PixopsFilte
 
 struct _PixopsFilter
 {
-  int *weights;
+  double *x_weights;
+  double *y_weights;
   int n_x;
   int n_y;
   double x_offset;
   double y_offset;
+  double overall_alpha;
 }; 
 
 typedef guchar *(*PixopsLineFunc) (int *weights, int n_x, int n_y,
@@ -938,6 +940,62 @@ process_pixel (int *weights, int n_x, in
   (*pixel_func) (dest, dest_x, dest_channels, dest_has_alpha, src_has_alpha, check_size, color1, color2, r, g, b, a);
 }
 
+static void 
+correct_total (int    *weights, 
+               int    n_x, 
+               int    n_y,
+               int    total, 
+               double overall_alpha)
+{
+  int correction = (int)(0.5 + 65536 * overall_alpha) - total;
+
+  if (correction != 0)
+    {
+      int i;
+      for (i = n_x * n_y - 1; i >= 0; i--) 
+        {
+          if (*(weights + i) + correction >= 0) 
+            {
+              *(weights + i) += correction;
+              break;
+            }
+        }
+    }
+}
+
+static int *
+convert_vector_to_table(PixopsFilter *vector)
+{
+  int i_offset, j_offset;
+  int n_x = vector->n_x;
+  int n_y = vector->n_y;
+  int * weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);
+
+  for (i_offset=0; i_offset < SUBSAMPLE; i_offset++)
+    for (j_offset=0; j_offset < SUBSAMPLE; j_offset++)
+      {
+        double weight;
+        int *pixel_weights = weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
+        int total = 0;
+        int i,j;
+
+        for (i=0; i < n_y; i++)
+          for (j=0; j < n_x; j++)
+            {
+              weight = vector->x_weights[(j_offset * n_x) + j] *
+                       vector->y_weights[(i_offset * n_y) + i] * 65536 + 0.5;
+
+              total += (int)weight;
+
+              *(pixel_weights + n_x * i + j) = weight;
+            }
+
+        correct_total (pixel_weights, n_x, n_y, total, vector->overall_alpha);
+      }
+
+  return weights;
+}
+
 static void
 pixops_process (guchar         *dest_buf,
 		int             render_x0,
@@ -968,6 +1026,8 @@ pixops_process (guchar         *dest_buf
   int x, y;			/* X and Y position in source (fixed_point) */
   guchar **line_bufs = g_new (guchar *, filter->n_y);
 
+  int *filter_weights = convert_vector_to_table(filter);
+
   int x_step = (1 << SCALE_SHIFT) / scale_x; /* X step in source (fixed point) */
   int y_step = (1 << SCALE_SHIFT) / scale_y; /* Y step in source (fixed point) */
 
@@ -997,7 +1057,9 @@ pixops_process (guchar         *dest_buf
       int dest_x;
       int y_start = y >> SCALE_SHIFT;
       int x_start;
-      int *run_weights = filter->weights + ((y >> (SCALE_SHIFT - SUBSAMPLE_BITS)) & SUBSAMPLE_MASK) * filter->n_x * filter->n_y * SUBSAMPLE;
+      int *run_weights = filter_weights +
+                         ((y >> (SCALE_SHIFT - SUBSAMPLE_BITS)) & SUBSAMPLE_MASK) *
+                         filter->n_x * filter->n_y * SUBSAMPLE;
       guchar *new_outbuf;
       guint32 tcolor1, tcolor2;
       
@@ -1074,320 +1136,269 @@ pixops_process (guchar         *dest_buf
     }
 
   g_free (line_bufs);
+  g_free (filter_weights);
 }
 
-static void 
-correct_total (int    *weights, 
-	       int    n_x, 
-	       int    n_y,
-	       int    total, 
-	       double overall_alpha)
-{
-  int correction = (int)(0.5 + 65536 * overall_alpha) - total;
-  int i;
-  for (i = n_x * n_y - 1; i >= 0; i--) 
-    {
-      if (*(weights + i) + correction >= 0) 
-	{
-	  *(weights + i) += correction;
-	  break;
-	}
-    }
-}  
-
 static void
 tile_make_weights (PixopsFilter *filter, double x_scale, double y_scale, double overall_alpha)
 {
-  int i_offset, j_offset;
-
-  int n_x = ceil(1/x_scale + 1);
-  int n_y = ceil(1/y_scale + 1);
+  double *x_pixel_weights;
+  double *y_pixel_weights;
+  double sqrt_alpha = sqrt(overall_alpha);
+  double x_constant = sqrt_alpha * x_scale;
+  double y_constant = sqrt_alpha * y_scale;
+  int n_x = ceil (1/x_scale + 1);
+  int n_y = ceil (1/y_scale + 1);
+  int offset;
+  int i;
 
   filter->x_offset = 0;
   filter->y_offset = 0;
   filter->n_x = n_x;
   filter->n_y = n_y;
-  filter->weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);
-
-  for (i_offset=0; i_offset<SUBSAMPLE; i_offset++)
-    for (j_offset=0; j_offset<SUBSAMPLE; j_offset++)
-      {
-	int *pixel_weights = filter->weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
-	double x = (double)j_offset / SUBSAMPLE;
-	double y = (double)i_offset / SUBSAMPLE;
-	int i,j;
-	int total = 0;
-	  
-	for (i = 0; i < n_y; i++)
-	  {
-	    double tw, th;
-		
-	    if (i < y)
-	      {
-
-		if (i + 1 > y)
-		  th = MIN(i+1, y + 1/y_scale) - y;
-		else
-		  th = 0;
-	      }
-	    else
-	      {
-		if (y + 1/y_scale > i)
-		  th = MIN(i+1, y + 1/y_scale) - i;
-		else
-		  th = 0;
-	      }
-		
-	    for (j = 0; j < n_x; j++)
-	      {
-		int weight;
-		
-		if (j < x)
-		  {
-		    if (j + 1 > x)
-		      tw = MIN(j+1, x + 1/x_scale) - x;
-		    else
-		      tw = 0;
-		  }
-		else
-		  {
-		    if (x + 1/x_scale > j)
-		      tw = MIN(j+1, x + 1/x_scale) - j;
-		    else
-		      tw = 0;
-		  }
-
-		weight = 65536 * tw * x_scale * th * y_scale * overall_alpha + 0.5;
-		total += weight;
-		*(pixel_weights + n_x * i + j) = weight;
-	      }
-	  }
-	
-	correct_total (pixel_weights, n_x, n_y, total, overall_alpha);
-      }
+  filter->x_weights = g_new (double, SUBSAMPLE * n_x);
+  filter->y_weights = g_new (double, SUBSAMPLE * n_y);
+  filter->overall_alpha = overall_alpha;
+
+  x_pixel_weights = filter->x_weights;
+  y_pixel_weights = filter->y_weights;
+
+  for (offset=0; offset < SUBSAMPLE; offset++)
+    {
+      double x = (double)offset / SUBSAMPLE;
+      double a = x + 1 / x_scale;
+      double b = x + 1 / y_scale;
+
+      for (i = 0; i < n_x; i++)
+        {
+          if (i < x)
+            {
+              if (i + 1 > x)
+                *(x_pixel_weights++)  = (MIN (i + 1, a) - x) * x_constant;
+              else
+                *(x_pixel_weights++) = 0;
+            }
+          else
+            {
+              if (a > i)
+                *(x_pixel_weights++)  = (MIN (i + 1, a) - i) * x_constant;
+              else
+                *(x_pixel_weights++) = 0;
+            }
+       }
+        
+      for (i = 0; i < n_y; i++)
+        {
+          if (i < x)
+            {
+              if (i + 1 > x)
+                *(y_pixel_weights++) = (MIN (i + 1, b) - x) * y_constant;
+              else
+                *(y_pixel_weights++) = 0;
+            }
+          else
+            {
+              if (b > i)
+                *(y_pixel_weights++) = (MIN (i + 1, b) - i) * y_constant;
+              else
+                *(y_pixel_weights++) = 0;
+            }
+        }
+    }
 }
 
 static void
 bilinear_make_fast_weights (PixopsFilter *filter, double x_scale, double y_scale, double overall_alpha)
 {
-  int i_offset, j_offset;
-  double *x_weights, *y_weights;
+  double *x_pixel_weights;
+  double *y_pixel_weights;
+  double sqrt_alpha = sqrt(overall_alpha);
+  double x_constant = sqrt_alpha * x_scale;
+  double y_constant = sqrt_alpha * y_scale;
   int n_x, n_y;
+  int offset;
+  int i;
 
-  if (x_scale > 1.0)		/* Bilinear */
+  if (x_scale > 1.0)            /* Bilinear */
     {
       n_x = 2;
-      filter->x_offset = 0.5 * (1/x_scale - 1);
+      filter->x_offset = 0.5 * (1 / x_scale - 1);
     }
-  else				/* Tile */
+  else                          /* Tile */
     {
-      n_x = ceil(1.0 + 1.0/x_scale);
+      n_x = ceil (1.0 + 1.0 / x_scale);
       filter->x_offset = 0.0;
     }
 
-  if (y_scale > 1.0)		/* Bilinear */
+  if (y_scale > 1.0)            /* Bilinear */
     {
       n_y = 2;
-      filter->y_offset = 0.5 * (1/y_scale - 1);
+      filter->y_offset = 0.5 * (1 / y_scale - 1);
     }
-  else				/* Tile */
+  else                          /* Tile */
     {
-      n_y = ceil(1.0 + 1.0/y_scale);
+      n_y = ceil (1.0 + 1.0 / y_scale);
       filter->y_offset = 0.0;
     }
 
   filter->n_y = n_y;
   filter->n_x = n_x;
-  filter->weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);
-
-  x_weights = g_new (double, n_x);
-  y_weights = g_new (double, n_y);
-
-  for (i_offset=0; i_offset<SUBSAMPLE; i_offset++)
-    for (j_offset=0; j_offset<SUBSAMPLE; j_offset++)
-      {
-	int *pixel_weights = filter->weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
-	double x = (double)j_offset / SUBSAMPLE;
-	double y = (double)i_offset / SUBSAMPLE;
-	int i,j;
-	int total = 0;
-
-	if (x_scale > 1.0)	/* Bilinear */
-	  {
-	    for (i = 0; i < n_x; i++)
-	      {
-		x_weights[i] = ((i == 0) ? (1 - x) : x) / x_scale;
-	      }
-	  }
-	else			/* Tile */
-	  {
-	    /*           x
-	     * ---------|--.-|----|--.-|-------  SRC
-	     * ------------|---------|---------  DEST
-	     */
-	    for (i = 0; i < n_x; i++)
-	      {
-		if (i < x)
-		  {
-		    if (i + 1 > x)
-		      x_weights[i] = MIN(i+1, x + 1/x_scale) - x;
-		    else
-		      x_weights[i] = 0;
-		  }
-		else
-		  {
-		    if (x + 1/x_scale > i)
-		      x_weights[i] = MIN(i+1, x + 1/x_scale) - i;
-		    else
-		      x_weights[i] = 0;
-		  }
-	      }
-	  }
-
-	if (y_scale > 1.0)	/* Bilinear */
-	  {
-	    for (i = 0; i < n_y; i++)
-	      {
-		y_weights[i] = ((i == 0) ? (1 - y) : y) / y_scale;
-	      }
-	  }
-	else			/* Tile */
-	  {
-	    /*           y
-	     * ---------|--.-|----|--.-|-------  SRC
-	     * ------------|---------|---------  DEST
-	     */
-	    for (i = 0; i < n_y; i++)
-	      {
-		if (i < y)
-		  {
-		    if (i + 1 > y)
-		      y_weights[i] = MIN(i+1, y + 1/y_scale) - y;
-		    else
-		      y_weights[i] = 0;
-		  }
-		else
-		  {
-		    if (y + 1/y_scale > i)
-		      y_weights[i] = MIN(i+1, y + 1/y_scale) - i;
-		    else
-		      y_weights[i] = 0;
-		  }
-	      }
-	  }
-
-	for (i = 0; i < n_y; i++)
-	  for (j = 0; j < n_x; j++)
-	    {
-	      int weight = 65536 * x_weights[j] * x_scale * y_weights[i] * y_scale * overall_alpha + 0.5;
-	      *(pixel_weights + n_x * i + j) = weight;
-	      total += weight;
-	    }
-
-	correct_total (pixel_weights, n_x, n_y, total, overall_alpha);
-      }
+  filter->x_weights = g_new (double, SUBSAMPLE * n_x);
+  filter->y_weights = g_new (double, SUBSAMPLE * n_y);
+  filter->overall_alpha = overall_alpha;
+
+  x_pixel_weights = filter->x_weights;
+  y_pixel_weights = filter->y_weights;
+
+  for (offset=0; offset < SUBSAMPLE; offset++)
+    {
+      double x = (double)offset / SUBSAMPLE;
+
+      if (x_scale > 1.0)      /* Bilinear */
+        {
+          for (i = 0; i < n_x; i++)
+            *(x_pixel_weights++) = (((i == 0) ? (1 - x) : x) / x_scale) * x_constant;
+        }
+      else                  /* Tile */
+        {
+          double a = x + 1 / x_scale;
+
+          /*           x
+           * ---------|--.-|----|--.-|-------  SRC
+           * ------------|---------|---------  DEST
+           */
+          for (i = 0; i < n_x; i++)
+            {
+              if (i < x)
+                {
+                  if (i + 1 > x)
+                    *(x_pixel_weights++) = (MIN (i + 1, a) - x) * x_constant;
+                  else
+                    *(x_pixel_weights++) = 0;
+                }
+              else
+                {
+                  if (a > i)
+                    *(x_pixel_weights++) = (MIN (i + 1, a) - i) * x_constant;
+                  else
+                    *(x_pixel_weights++) = 0;
+                }
+            }
+        }
 
-  g_free (x_weights);
-  g_free (y_weights);
+      if (y_scale > 1.0)      /* Bilinear */
+        {
+          for (i = 0; i < n_y; i++)
+            *(y_pixel_weights++) = (((i == 0) ? (1 - x) : x) / y_scale) * y_constant;
+        }
+      else                  /* Tile */
+        {
+          double b = x + 1 / y_scale;
+
+          /*           x
+           * ---------|--.-|----|--.-|-------  SRC
+           * ------------|---------|---------  DEST
+           */
+          for (i = 0; i < n_y; i++)
+            {
+              if (i < x)
+                {
+                  if (i + 1 > x)
+                    *(y_pixel_weights++) = (MIN (i + 1, b) - x) * y_constant;
+                  else
+                    *(y_pixel_weights++) = 0;
+                }
+              else
+                {
+                  if (b > i)
+                    *(y_pixel_weights++) = (MIN (i + 1, b) - i) * y_constant;
+                  else
+                    *(y_pixel_weights++) = 0;
+                }
+            }
+        }
+    }
 }
 
 static double
-bilinear_quadrant (double bx0, double bx1, double by0, double by1)
+bilinear_quadrant (double b0, double b1)
 {
-  double ax0, ax1, ay0, ay1;
-  double x0, x1, y0, y1;
-
-  ax0 = 0.;
-  ax1 = 1.;
-  ay0 = 0.;
-  ay1 = 1.;
+  double a0, a1;
+  double x0, x1;
 
-  if (ax0 < bx0)
-    {
-      if (ax1 > bx0)
-	{
-	  x0 = bx0;
-	  x1 = MIN (ax1, bx1);
-	}
-      else
-	return 0;
-    }
-  else
-    {
-      if (bx1 > ax0)
-	{
-	  x0 = ax0;
-	  x1 = MIN (ax1, bx1);
-	}
-      else
-	return 0;
-    }
+  a0 = 0.;
+  a1 = 1.;
 
-  if (ay0 < by0)
+  if (a0 < b0)
     {
-      if (ay1 > by0)
-	{
-	  y0 = by0;
-	  y1 = MIN (ay1, by1);
-	}
+      if (a1 > b0)
+        {
+          x0 = b0;
+          x1 = MIN (a1, b1);
+        }
       else
-	return 0;
+        return 0;
     }
   else
     {
-      if (by1 > ay0)
-	{
-	  y0 = ay0;
-	  y1 = MIN (ay1, by1);
-	}
+      if (b1 > a0)
+        {
+          x0 = a0;
+          x1 = MIN (a1, b1);
+        }
       else
-	return 0;
+        return 0;
     }
 
-  return 0.25 * (x1*x1 - x0*x0) * (y1*y1 - y0*y0);
+  return (x1*x1 - x0*x0);
 }
 
 static void
 bilinear_make_weights (PixopsFilter *filter, double x_scale, double y_scale, double overall_alpha)
 {
-  int i_offset, j_offset;
-
-  int n_x = ceil(1/x_scale + 2.0);
-  int n_y = ceil(1/y_scale + 2.0);
+  double *y_pixel_weights;
+  double *x_pixel_weights;
+  double sqrt_alpha = sqrt(overall_alpha);
+  double x_constant = sqrt_alpha * x_scale;
+  double y_constant = sqrt_alpha * y_scale;
+  double w;
+  int n_x = ceil (1/x_scale + 2.0);
+  int n_y = ceil (1/y_scale + 2.0);
+  int offset, i;
 
   filter->x_offset = -1.0;
   filter->y_offset = -1.0;
   filter->n_x = n_x;
   filter->n_y = n_y;
-  
-  filter->weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);
-
-  for (i_offset=0; i_offset<SUBSAMPLE; i_offset++)
-    for (j_offset=0; j_offset<SUBSAMPLE; j_offset++)
-      {
-	int *pixel_weights = filter->weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
-	double x = (double)j_offset / SUBSAMPLE;
-	double y = (double)i_offset / SUBSAMPLE;
-	int i,j;
-	int total = 0;
-
-	for (i = 0; i < n_y; i++)
-	  for (j = 0; j < n_x; j++)
-	    {
-	      double w;
-	      int weight;
-	      
-	      w = bilinear_quadrant  (0.5 + j - (x + 1 / x_scale), 0.5 + j - x, 0.5 + i - (y + 1 / y_scale), 0.5 + i - y);
-	      w += bilinear_quadrant (1.5 + x - j, 1.5 + (x + 1 / x_scale) - j, 0.5 + i - (y + 1 / y_scale), 0.5 + i - y);
-	      w += bilinear_quadrant (0.5 + j - (x + 1 / x_scale), 0.5 + j - x, 1.5 + y - i, 1.5 + (y + 1 / y_scale) - i);
-	      w += bilinear_quadrant (1.5 + x - j, 1.5 + (x + 1 / x_scale) - j, 1.5 + y - i, 1.5 + (y + 1 / y_scale) - i);
-	      weight = 65536 * w * x_scale * y_scale * overall_alpha + 0.5;
-	      *(pixel_weights + n_x * i + j) = weight;
-	      total += weight;
-	    }
-	
-	correct_total (pixel_weights, n_x, n_y, total, overall_alpha);
-      }
+  filter->x_weights = g_new (double, SUBSAMPLE * n_x);
+  filter->y_weights = g_new (double, SUBSAMPLE * n_y);
+  filter->overall_alpha = overall_alpha;
+
+  x_pixel_weights = filter->x_weights;
+  y_pixel_weights = filter->y_weights;
+
+  for (offset=0; offset < SUBSAMPLE; offset++)
+    {
+      double x = (double)offset / SUBSAMPLE;
+      double a = x + 1 / x_scale;
+      double b = x + 1 / y_scale;
+
+      for (i = 0; i < n_x; i++)
+        {
+          w  = 0.5 * bilinear_quadrant (0.5 + i - a, 0.5 + i - x);
+          w += 0.5 * bilinear_quadrant (1.5 + x - i, 1.5 + a - i);
+      
+          *(x_pixel_weights++) = w * x_constant;
+        }
+
+      for (i = 0; i < n_y; i++)
+        {
+          w =  0.5 * bilinear_quadrant (0.5 + i - b, 0.5 + i - x);
+          w += 0.5 * bilinear_quadrant (1.5 + x - i, 1.5 + b - i);
+      
+          *(y_pixel_weights++) = w * y_constant;
+        }
+    }
 }
 
 void
@@ -1474,7 +1485,8 @@ pixops_composite_color (guchar         *
 		  src_has_alpha, scale_x, scale_y, check_x, check_y, check_size, color1, color2,
 		  &filter, line_func, composite_pixel_color);
 
-  g_free (filter.weights);
+  g_free (filter.x_weights);
+  g_free (filter.y_weights);
 }
 
 /**
@@ -1584,7 +1596,8 @@ pixops_composite (guchar        *dest_bu
 		  src_has_alpha, scale_x, scale_y, 0, 0, 0, 0, 0, 
 		  &filter, line_func, composite_pixel);
 
-  g_free (filter.weights);
+  g_free (filter.x_weights);
+  g_free (filter.y_weights);
 }
 
 void
@@ -1660,6 +1673,7 @@ pixops_scale (guchar        *dest_buf,
 		  src_has_alpha, scale_x, scale_y, 0, 0, 0, 0, 0,
 		  &filter, line_func, scale_pixel);
 
-  g_free (filter.weights);
+  g_free (filter.x_weights);
+  g_free (filter.y_weights);
 }
 
#include <stdio.h>
#include <stdlib.h>
#include <glib.h>
#include <math.h>

#undef       MIN
#define MIN(a, b)  (((a) < (b)) ? (a) : (b))
 
#define SUBSAMPLE_BITS 4
#define SUBSAMPLE (1 << SUBSAMPLE_BITS)
#define SUBSAMPLE_MASK ((1 << SUBSAMPLE_BITS)-1)
#define SCALE_SHIFT 16

#define DEBUG_PRINT_CORRECTION 0
#define DEBUG_PRINT_TABLE 0
#define DEBUG_PRINT_VECTOR 0

typedef struct _Pixops2dFilter Pixops2dFilter;
typedef struct _PixopsFilter PixopsFilter;

struct _Pixops2dFilter
{
  int *weights;
  int n_x;
  int n_y;
  double x_offset;
  double y_offset;
}; 

struct _PixopsFilter
{
  double *x_weights;
  double *y_weights;
  int n_x;
  int n_y;
  double x_offset;
  double y_offset;
  double overall_alpha;
}; 

/* Code common to all interpolation filters */

static void 
correct_total (int    *weights, 
	       int    n_x, 
	       int    n_y,
	       int    total, 
	       double overall_alpha)
{
  int correction = (int)(0.5 + 65536 * overall_alpha) - total;

  if (correction != 0)
    {
      int i;
      for (i = n_x * n_y - 1; i >= 0; i--) 
        {
          if (*(weights + i) + correction >= 0) 
            {
              *(weights + i) += correction;
              break;
            }
        }
    }
}

/* PIXOPS_INTERP_HYPER */

/* current code */

static double
bilinear_table_quadrant (double bx0, double bx1, double by0, double by1)
{
  double ax0, ax1, ay0, ay1;
  double x0, x1, y0, y1;

  ax0 = 0.;
  ax1 = 1.;
  ay0 = 0.;
  ay1 = 1.;

  if (ax0 < bx0)
    {
      if (ax1 > bx0)
	{
	  x0 = bx0;
	  x1 = MIN (ax1, bx1);
	}
      else
	return 0;
    }
  else
    {
      if (bx1 > ax0)
	{
	  x0 = ax0;
	  x1 = MIN (ax1, bx1);
	}
      else
	return 0;
    }

  if (ay0 < by0)
    {
      if (ay1 > by0)
	{
	  y0 = by0;
	  y1 = MIN (ay1, by1);
	}
      else
	return 0;
    }
  else
    {
      if (by1 > ay0)
	{
	  y0 = ay0;
	  y1 = MIN (ay1, by1);
	}
      else
	return 0;
    }

  return 0.25 * (x1*x1 - x0*x0) * (y1*y1 - y0*y0);
}

static void
bilinear_make_table_weights (Pixops2dFilter *filter, double x_scale, double y_scale, double overall_alpha)
{
  int i_offset, j_offset;

  int n_x = ceil (1/x_scale + 2.0);
  int n_y = ceil (1/y_scale + 2.0);

  filter->x_offset = -1.0;
  filter->y_offset = -1.0;
  filter->n_x = n_x;
  filter->n_y = n_y;
  
  filter->weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);

  for (i_offset=0; i_offset<SUBSAMPLE; i_offset++)
    for (j_offset=0; j_offset<SUBSAMPLE; j_offset++)
      {
	int *pixel_weights = filter->weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
	double x = (double)j_offset / SUBSAMPLE;
	double y = (double)i_offset / SUBSAMPLE;
	int i,j;
	int total = 0;

	for (i = 0; i < n_y; i++)
	  for (j = 0; j < n_x; j++)
	    {
	      double w;
	      int weight;
	      
	      w = bilinear_table_quadrant  (0.5 + j - (x + 1 / x_scale), 0.5 + j - x, 0.5 + i - (y + 1 / y_scale), 0.5 + i - y);
	      w += bilinear_table_quadrant (1.5 + x - j, 1.5 + (x + 1 / x_scale) - j, 0.5 + i - (y + 1 / y_scale), 0.5 + i - y);
	      w += bilinear_table_quadrant (0.5 + j - (x + 1 / x_scale), 0.5 + j - x, 1.5 + y - i, 1.5 + (y + 1 / y_scale) - i);
	      w += bilinear_table_quadrant (1.5 + x - j, 1.5 + (x + 1 / x_scale) - j, 1.5 + y - i, 1.5 + (y + 1 / y_scale) - i);
	      weight = 65536 * w * x_scale * y_scale * overall_alpha + 0.5;
	      *(pixel_weights + n_x * i + j) = weight;
	      total += weight;
	    }

	correct_total (pixel_weights, n_x, n_y, total, overall_alpha);
      }
}

/* new code */

static double
bilinear_quadrant (double b0, double b1)
{
  double a0, a1;
  double x0, x1;

  a0 = 0.;
  a1 = 1.;

  if (a0 < b0)
    {
      if (a1 > b0)
	{
	  x0 = b0;
	  x1 = MIN (a1, b1);
	}
      else
	return 0;
    }
  else
    {
      if (b1 > a0)
	{
	  x0 = a0;
	  x1 = MIN (a1, b1);
	}
      else
	return 0;
    }

  return (x1*x1 - x0*x0);
}

static void
bilinear_make_weights (PixopsFilter *filter, double x_scale, double y_scale, double overall_alpha)
{
  double *y_pixel_weights;
  double *x_pixel_weights;
  double sqrt_alpha = sqrt(overall_alpha);
  double x_constant = sqrt_alpha * x_scale;
  double y_constant = sqrt_alpha * y_scale;
  double w;
  int n_x = ceil (1/x_scale + 2.0);
  int n_y = ceil (1/y_scale + 2.0);
  int offset, i;

  filter->x_offset = -1.0;
  filter->y_offset = -1.0;
  filter->n_x = n_x;
  filter->n_y = n_y;
  filter->x_weights = g_new (double, SUBSAMPLE * n_x);
  filter->y_weights = g_new (double, SUBSAMPLE * n_y);
  filter->overall_alpha = overall_alpha;

  x_pixel_weights = filter->x_weights;
  y_pixel_weights = filter->y_weights;

  for (offset=0; offset < SUBSAMPLE; offset++)
    {
      double x = (double)offset / SUBSAMPLE;
      double a = x + 1 / x_scale;
      double b = x + 1 / y_scale;

      for (i = 0; i < n_x; i++)
        {
          w  = 0.5 * bilinear_quadrant (0.5 + i - a, 0.5 + i - x);
          w += 0.5 * bilinear_quadrant (1.5 + x - i, 1.5 + a - i);
      
          *(x_pixel_weights++) = w * x_constant;
        }

      for (i = 0; i < n_y; i++)
        {
          w =  0.5 * bilinear_quadrant (0.5 + i - b, 0.5 + i - x);
          w += 0.5 * bilinear_quadrant (1.5 + x - i, 1.5 + b - i);
      
          *(y_pixel_weights++) = w * y_constant;
        }
    }
}

/* PIXOPS_INTERP_BILINEAR */

/* current code */

static void
bilinear_make_fast_table_weights (Pixops2dFilter *filter, double x_scale, double y_scale, double overall_alpha)
{
  int i_offset, j_offset;
  double *x_weights, *y_weights;
  int n_x, n_y;

  if (x_scale > 1.0)		/* Bilinear */
    {
      n_x = 2;
      filter->x_offset = 0.5 * (1/x_scale - 1);
    }
  else				/* Tile */
    {
      n_x = ceil (1.0 + 1.0/x_scale);
      filter->x_offset = 0.0;
    }

  if (y_scale > 1.0)		/* Bilinear */
    {
      n_y = 2;
      filter->y_offset = 0.5 * (1/y_scale - 1);
    }
  else				/* Tile */
    {
      n_y = ceil (1.0 + 1.0/y_scale);
      filter->y_offset = 0.0;
    }

  filter->n_y = n_y;
  filter->n_x = n_x;
  filter->weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);

  x_weights = g_new (double, n_x);
  y_weights = g_new (double, n_y);

  for (i_offset=0; i_offset<SUBSAMPLE; i_offset++)
    for (j_offset=0; j_offset<SUBSAMPLE; j_offset++)
      {
	int *pixel_weights = filter->weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
	double x = (double)j_offset / SUBSAMPLE;
	double y = (double)i_offset / SUBSAMPLE;
	int i,j;
	int total = 0;

	if (x_scale > 1.0)	/* Bilinear */
	  {
	    for (i = 0; i < n_x; i++)
	      {
		x_weights[i] = ((i == 0) ? (1 - x) : x) / x_scale;
	      }
	  }
	else			/* Tile */
	  {
	    /*           x
	     * ---------|--.-|----|--.-|-------  SRC
	     * ------------|---------|---------  DEST
	     */
	    for (i = 0; i < n_x; i++)
	      {
		if (i < x)
		  {
		    if (i + 1 > x)
		      x_weights[i] = MIN (i+1, x + 1/x_scale) - x;
		    else
		      x_weights[i] = 0;
		  }
		else
		  {
		    if (x + 1/x_scale > i)
		      x_weights[i] = MIN (i+1, x + 1/x_scale) - i;
		    else
		      x_weights[i] = 0;
		  }
	      }
	  }

	if (y_scale > 1.0)	/* Bilinear */
	  {
	    for (i = 0; i < n_y; i++)
	      {
		y_weights[i] = ((i == 0) ? (1 - y) : y) / y_scale;
	      }
	  }
	else			/* Tile */
	  {
	    /*           y
	     * ---------|--.-|----|--.-|-------  SRC
	     * ------------|---------|---------  DEST
	     */
	    for (i = 0; i < n_y; i++)
	      {
		if (i < y)
		  {
		    if (i + 1 > y)
		      y_weights[i] = MIN (i+1, y + 1/y_scale) - y;
		    else
		      y_weights[i] = 0;
		  }
		else
		  {
		    if (y + 1/y_scale > i)
		      y_weights[i] = MIN (i+1, y + 1/y_scale) - i;
		    else
		      y_weights[i] = 0;
		  }
	      }
	  }

	for (i = 0; i < n_y; i++)
	  for (j = 0; j < n_x; j++)
	    {
	      int weight = 65536 * x_weights[j] * x_scale * y_weights[i] * y_scale * overall_alpha + 0.5;
	      *(pixel_weights + n_x * i + j) = weight;
	      total += weight;
	    }

	correct_total (pixel_weights, n_x, n_y, total, overall_alpha);
      }

  g_free (x_weights);
  g_free (y_weights);
}

/* new code */

static void
bilinear_make_fast_weights (PixopsFilter *filter, double x_scale, double y_scale, double overall_alpha)
{
  double *x_pixel_weights;
  double *y_pixel_weights;
  double sqrt_alpha = sqrt(overall_alpha);
  double x_constant = sqrt_alpha * x_scale;
  double y_constant = sqrt_alpha * y_scale;
  int n_x, n_y;
  int offset;
  int i;

  if (x_scale > 1.0)		/* Bilinear */
    {
      n_x = 2;
      filter->x_offset = 0.5 * (1 / x_scale - 1);
    }
  else				/* Tile */
    {
      n_x = ceil (1.0 + 1.0 / x_scale);
      filter->x_offset = 0.0;
    }

  if (y_scale > 1.0)		/* Bilinear */
    {
      n_y = 2;
      filter->y_offset = 0.5 * (1 / y_scale - 1);
    }
  else				/* Tile */
    {
      n_y = ceil (1.0 + 1.0 / y_scale);
      filter->y_offset = 0.0;
    }

  filter->n_y = n_y;
  filter->n_x = n_x;
  filter->x_weights = g_new (double, SUBSAMPLE * n_x);
  filter->y_weights = g_new (double, SUBSAMPLE * n_y);
  filter->overall_alpha = overall_alpha;

  x_pixel_weights = filter->x_weights;
  y_pixel_weights = filter->y_weights;

  for (offset=0; offset < SUBSAMPLE; offset++)
    {
      double x = (double)offset / SUBSAMPLE;

      if (x_scale > 1.0)      /* Bilinear */
        {
          for (i = 0; i < n_x; i++)
            *(x_pixel_weights++) = (((i == 0) ? (1 - x) : x) / x_scale) * x_constant;
        }
      else                  /* Tile */
        {
          double a = x + 1 / x_scale;

          /*           x
           * ---------|--.-|----|--.-|-------  SRC
           * ------------|---------|---------  DEST
           */
          for (i = 0; i < n_x; i++)
            {
              if (i < x)
                {
                  if (i + 1 > x)
                    *(x_pixel_weights++) = (MIN (i + 1, a) - x) * x_constant;
                  else
                    *(x_pixel_weights++) = 0;
                }
              else
                {
                  if (a > i)
                    *(x_pixel_weights++) = (MIN (i + 1, a) - i) * x_constant;
                  else
                    *(x_pixel_weights++) = 0;
                }
            }
        }

      if (y_scale > 1.0)      /* Bilinear */
        {
          for (i = 0; i < n_y; i++)
            *(y_pixel_weights++) = (((i == 0) ? (1 - x) : x) / y_scale) * y_constant;
        }
      else                  /* Tile */
        {
          double b = x + 1 / y_scale;

          /*           x
           * ---------|--.-|----|--.-|-------  SRC
           * ------------|---------|---------  DEST
           */
          for (i = 0; i < n_y; i++)
            {
              if (i < x)
                {
                  if (i + 1 > x)
                    *(y_pixel_weights++) = (MIN (i + 1, b) - x) * y_constant;
                  else
                    *(y_pixel_weights++) = 0;
                }
              else
                {
                  if (b > i)
                    *(y_pixel_weights++) = (MIN (i + 1, b) - i) * y_constant;
                  else
                    *(y_pixel_weights++) = 0;
                }
            }
        }
    }
}

/* PIXOPS_INTERP_TILE */

/* current code */

static void
tile_make_table_weights (Pixops2dFilter *filter, double x_scale, double y_scale, double overall_alpha)
{
  int i_offset, j_offset;

  int n_x = ceil (1/x_scale + 1);
  int n_y = ceil (1/y_scale + 1);

  filter->x_offset = 0;
  filter->y_offset = 0;
  filter->n_x = n_x;
  filter->n_y = n_y;
  filter->weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);

  for (i_offset=0; i_offset<SUBSAMPLE; i_offset++)
    for (j_offset=0; j_offset<SUBSAMPLE; j_offset++)
      {
	int *pixel_weights = filter->weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
	double x = (double)j_offset / SUBSAMPLE;
	double y = (double)i_offset / SUBSAMPLE;
	int i,j;
	int total = 0;
	  
	for (i = 0; i < n_y; i++)
	  {
	    double tw, th;
		
	    if (i < y)
	      {

		if (i + 1 > y)
		  th = MIN (i+1, y + 1/y_scale) - y;
		else
		  th = 0;
	      }
	    else
	      {
		if (y + 1/y_scale > i)
		  th = MIN (i+1, y + 1/y_scale) - i;
		else
		  th = 0;
	      }
		
	    for (j = 0; j < n_x; j++)
	      {
		int weight;
		
		if (j < x)
		  {
		    if (j + 1 > x)
		      tw = MIN (j+1, x + 1/x_scale) - x;
		    else
		      tw = 0;
		  }
		else
		  {
		    if (x + 1/x_scale > j)
		      tw = MIN (j+1, x + 1/x_scale) - j;
		    else
		      tw = 0;
		  }

		weight = 65536 * tw * x_scale * th * y_scale * overall_alpha + 0.5;
		total += weight;
		*(pixel_weights + n_x * i + j) = weight;
	      }
	  }
	
	correct_total (pixel_weights, n_x, n_y, total, overall_alpha);
      }
}

/* new code */

static void
tile_make_weights (PixopsFilter *filter, double x_scale, double y_scale, double overall_alpha)
{
  double *x_pixel_weights;
  double *y_pixel_weights;
  double sqrt_alpha = sqrt(overall_alpha);
  double x_constant = sqrt_alpha * x_scale;
  double y_constant = sqrt_alpha * y_scale;
  int n_x = ceil (1/x_scale + 1);
  int n_y = ceil (1/y_scale + 1);
  int offset;
  int i;

  filter->x_offset = 0;
  filter->y_offset = 0;
  filter->n_x = n_x;
  filter->n_y = n_y;
  filter->x_weights = g_new (double, SUBSAMPLE * n_x);
  filter->y_weights = g_new (double, SUBSAMPLE * n_y);
  filter->overall_alpha = overall_alpha;

  x_pixel_weights = filter->x_weights;
  y_pixel_weights = filter->y_weights;

  for (offset=0; offset < SUBSAMPLE; offset++)
    {
      double x = (double)offset / SUBSAMPLE;
      double a = x + 1 / x_scale;
      double b = x + 1 / y_scale;

      for (i = 0; i < n_x; i++)
        {
          if (i < x)
            {
              if (i + 1 > x)
                *(x_pixel_weights++)  = (MIN (i + 1, a) - x) * x_constant;
              else
                *(x_pixel_weights++) = 0;
            }
          else
            {
              if (a > i)
                *(x_pixel_weights++)  = (MIN (i + 1, a) - i) * x_constant;
              else
                *(x_pixel_weights++) = 0;
            }
       }
        
      for (i = 0; i < n_y; i++)
        {
          if (i < x)
            {
              if (i + 1 > x)
                *(y_pixel_weights++) = (MIN (i + 1, b) - x) * y_constant;
              else
                *(y_pixel_weights++) = 0;
            }
          else
            {
              if (b > i)
                *(y_pixel_weights++) = (MIN (i + 1, b) - i) * y_constant;
              else
                *(y_pixel_weights++) = 0;
            }
        }
    }
}

/* Test driver code */

void
print_vector(PixopsFilter *vector)
{
  int i, j, x, y;

  if (! DEBUG_PRINT_VECTOR)
     return;

  printf ("  vector->n_x = %d, vector->n_y = %d\n", vector->n_x, vector->n_y);
  printf ("  vector->x_offset = %f, vector->y_offset = %f\n",
          vector->x_offset, vector->y_offset);

  for (x=0; x < SUBSAMPLE; x++)
    {
      double sum = 0;
      printf ("  --- x_vector %d of %d---\n", x, SUBSAMPLE - 1);
      for (j = 0; j < vector->n_x; j++)
        {
          printf ("  %5.2f  ", vector->x_weights[(x * vector->n_x) + j]);
          sum += vector->x_weights[(x * vector->n_x) + j];
        }
      printf( "  sum is %5.2f\n", sum);
    }
  for (y=0; y < SUBSAMPLE; y++)
    {
      double sum = 0;
      printf ("  --- y_vector %d of %d---\n", y, SUBSAMPLE - 1);
      for (i = 0; i < vector->n_y; i++)
        {
          printf ("  %5.2f  ", vector->y_weights[(y * vector->n_y) + i]);
          sum +=  vector->y_weights[(y * vector->n_y) + i];
        }
      printf( "  sum is %5.2f\n", sum);
    }
}

static int *
convert_vector_to_table(PixopsFilter *vector)
{
  int i_offset, j_offset;
  int n_x = vector->n_x;
  int n_y = vector->n_y;
  int * weights = g_new (int, SUBSAMPLE * SUBSAMPLE * n_x * n_y);

  print_vector(vector);

  if (DEBUG_PRINT_CORRECTION)
    printf ("\n  ---\n");

  for (i_offset=0; i_offset < SUBSAMPLE; i_offset++)
    for (j_offset=0; j_offset < SUBSAMPLE; j_offset++)
      {
        double weight;
        int *pixel_weights = weights + ((i_offset*SUBSAMPLE) + j_offset) * n_x * n_y;
        int total = 0;
        int i,j;

        for (i=0; i < n_y; i++)
          for (j=0; j < n_x; j++)
            {
              weight = vector->x_weights[(j_offset * n_x) + j] *
                       vector->y_weights[(i_offset * n_y) + i] *
                       65536 + 0.5;

              total += (int)weight;
		
              *(pixel_weights + n_x * i + j) = weight;
            }

        if (DEBUG_PRINT_CORRECTION)
          {
            printf ("   Correcting table (%d %d) = %d\n", i_offset, j_offset,
                    (int)(0.5 + 65536 * vector->overall_alpha) - total);
          }
	
	correct_total (pixel_weights, n_x, n_y, total, vector->overall_alpha);
      }

  if (DEBUG_PRINT_CORRECTION)
    printf ("  ---\n\n");

  return weights;
}

void
print_table(Pixops2dFilter *filter)
{
  double w0;
  int i, j, x;

  if (! DEBUG_PRINT_TABLE)
     return;

  printf ("  filter->n_x = %d, filter->n_y = %d\n", filter->n_x, filter->n_y);
  printf ("  filter->x_offset = %f, filter->y_offset = %f\n",
          filter->x_offset, filter->y_offset);
  w0 = (double)filter->weights[0];

  for (x=0; x < SUBSAMPLE * SUBSAMPLE; x++)
    {
      printf ("  --- filter %d of %d---\n", x, SUBSAMPLE * SUBSAMPLE - 1);
      for (i = 0; i < filter->n_y; i++)
        {
          for (j = 0; j < filter->n_x; j++)
            printf ("  %5.2f  ", filter->weights[(i + (x * filter->n_y)) *
                    filter->n_x + j] / w0);

          printf( "\n");
        }
      printf( "\n");
    }
}

void
compare_filters (Pixops2dFilter *filter1, Pixops2dFilter *filter2)
{
  int x;
  int total_error_flag = 0;

  if (filter1->n_y != filter2->n_y || filter1->n_x != filter2->n_x)
     printf ("ERROR - Tables not same size\n");

  else
    {
      printf ("  --- filter check start ---\n");

      for (x=0; x < filter1->n_x * filter1->n_y * SUBSAMPLE * SUBSAMPLE; x++)
	{
	  if (filter1->weights[x] != filter2->weights[x])
            {
               printf ("ERROR - There is a problem with position %d %ld %ld\n",
		 x, filter1->weights[x], filter2->weights[x]);
		total_error_flag++;
            }
	}

      printf ("  --- filter check end ---\n");

      if (total_error_flag > 0)
        printf ("ERROR - %d total problems were found\n\n", total_error_flag);
      else
        printf ("  No problems were found\n\n");
    }
}

void do_hyper_test (double x_scale, double y_scale, double overall_alpha)
{
  Pixops2dFilter table;
  Pixops2dFilter converted_table;
  PixopsFilter   vector;

  bilinear_make_table_weights (&table, x_scale, y_scale, overall_alpha);
  printf ("x_scale = %f, y_scale = %f\n", x_scale, y_scale);
  print_table (&table);

  bilinear_make_weights (&vector, x_scale, y_scale, overall_alpha);
  printf ("x_scale = %f, y_scale = %f\n", x_scale, y_scale);

  converted_table.weights = convert_vector_to_table (&vector);
  converted_table.n_x = vector.n_x;
  converted_table.n_y = vector.n_y;
  converted_table.x_offset = vector.x_offset;
  converted_table.y_offset = vector.y_offset;

  print_table (&converted_table);

  compare_filters (&table, &converted_table);
}

void do_bilinear_test (double x_scale, double y_scale, double overall_alpha)
{
  Pixops2dFilter table;
  Pixops2dFilter converted_table;
  PixopsFilter   vector;

  bilinear_make_fast_table_weights (&table, x_scale, y_scale, overall_alpha);
  printf ("x_scale = %f, y_scale = %f\n", x_scale, y_scale);
  print_table (&table);

  bilinear_make_fast_weights (&vector, x_scale, y_scale, overall_alpha);
  printf ("x_scale = %f, y_scale = %f\n", x_scale, y_scale);

  converted_table.weights = convert_vector_to_table (&vector);
  converted_table.n_x = vector.n_x;
  converted_table.n_y = vector.n_y;
  converted_table.x_offset = vector.x_offset;
  converted_table.y_offset = vector.y_offset;

  print_table (&converted_table);

  compare_filters (&table, &converted_table);
}

void do_tile_test (double x_scale, double y_scale, double overall_alpha)
{
  Pixops2dFilter table;
  Pixops2dFilter converted_table;
  PixopsFilter   vector;

  tile_make_table_weights (&table, x_scale, y_scale, overall_alpha);
  printf ("  x_scale = %f, y_scale = %f\n", x_scale, y_scale);
  print_table (&table);

  tile_make_weights (&vector, x_scale, y_scale, overall_alpha);
  printf ("  x_scale = %f, y_scale = %f\n", x_scale, y_scale);

  converted_table.weights = convert_vector_to_table (&vector);
  converted_table.n_x = vector.n_x;
  converted_table.n_y = vector.n_y;
  converted_table.x_offset = vector.x_offset;
  converted_table.y_offset = vector.y_offset;

  print_table (&converted_table);

  compare_filters (&table, &converted_table);
}

int
main()
{
  printf ("\nDoing PIXOPS_INTERP_TILE tests:\n\n");
  do_tile_test (1.0,  1.0, 1.0);
  do_tile_test (0.5,  0.5, 1.0);
  do_tile_test (0.3,  0.3, 1.0);
  do_tile_test (0.25, 0.25, 1.0);
  do_tile_test (0.20, 0.20, 1.0);
  do_tile_test (0.15, 0.15, 1.0);
  do_tile_test (0.5,  1.0, 1.0);
  do_tile_test (0.5,  2.0, 1.0);
  do_tile_test (0.5,  3.0, 1.0);
  do_tile_test (0.5,  4.0, 1.0);
  do_tile_test (2.0,  2.0, 1.0);

  printf ("\nDoing PIXOPS_INTERP_BILINEAR tests:\n\n");
  do_bilinear_test (1.0,  1.0, 1.0);
  do_bilinear_test (0.5,  0.5, 1.0);
  do_bilinear_test (0.3,  0.3, 1.0);
  do_bilinear_test (0.25, 0.25, 1.0);
  do_bilinear_test (0.20, 0.20, 1.0);
  do_bilinear_test (0.15, 0.15, 1.0);
  do_bilinear_test (0.5,  1.0, 1.0);
  do_bilinear_test (0.5,  2.0, 1.0);
  do_bilinear_test (0.5,  3.0, 1.0);
  do_bilinear_test (0.5,  4.0, 1.0);
  do_bilinear_test (2.0,  2.0, 1.0);

  printf ("\nDoing PIXOPS_INTERP_HYPER tests:\n\n");
  do_hyper_test (1.0,  1.0, 1.0);
  do_hyper_test (0.5,  0.5, 1.0);
  do_hyper_test (0.3,  0.3, 1.0);
  do_hyper_test (0.25, 0.25, 1.0);
  do_hyper_test (0.20, 0.20, 1.0);
  do_hyper_test (0.15, 0.15, 1.0);
  do_hyper_test (0.5,  1.0, 1.0);
  do_hyper_test (0.5,  2.0, 1.0);
  do_hyper_test (0.5,  3.0, 1.0);
  do_hyper_test (0.5,  4.0, 1.0);
  do_hyper_test (2.0,  2.0, 1.0);

  exit (0);
}



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]