Re: Gsl FFT <-> FFTW compatibility
- From: timj gnu org
- To: Stefan Westerfeld <stefan space twc de>
- Cc: Beast Liste <beast gnome org>
- Subject: Re: Gsl FFT <-> FFTW compatibility
- Date: Mon, 16 May 2011 23:35:32 +0200 (CEST)
Hi Stefan,
thanks for working on this.
Attached you will find 4 patches which you can apply with git-am.
These are your fft branch patches rebased onto current master plus
documentation fixes where recent changes introduced conflicts.
Unfortunately the code can't be merged as is (thus this email)
because of the following issues:
* testfft.c - buffer overrun in line 200: memcpy (dft_in, ref_fft_in, MAX_FFT_SIZE * sizeof (dft_in[0]));
from glibc:
Checking reference fft for size 2: [*** buffer overflow detected ***: /opt/src/beast/bse/tests/.libs/lt-testfft terminated
This is possibly because MAX_DFT_SIZE != MAX_FFT_SIZE.
Why's there a MAX_DFT_SIZE in the first place?
* Why does gsl_power2_fftsc() contain this line?
/* { unsigned int i; for (i = 0; i < n_values << 1; i++) rivalues_out[i] *= (double) n_values; } */
* Why does gsl_power2_fftar(), an AnalysisReal function now contain
a call to a *synthesis* function? gsl_power2_fftsc
* Why does gsl_power2_fftsr(), a SynthesisReal function now contain a
call to an *analysis* function? gsl_power2_fft*analysis_skip2
For the latter two, should there possibly be some a<->s or analysis<->synthesis
renames in place, to fix the obvious consistency discrepancy?
On Wed, 13 Oct 2010, Stefan Westerfeld wrote:
Hi!
I've changed gslfft to produce the same results FFTW produces. This should make
it easy to optionally use FFTW to speed up gslfft, however this merge request
doesn't provide any optional FFTW support, but only prepares the code base for
that.
For the real variants, I've not rederived the formulas from scratch, but added
a minus sign (on the imaginary part) where required, to be able to use the old
code. Besides the extended testfft test, I've tested all variants against FFTW,
to ensure that the results are the same.
I've also grepped through all code for gslfft usage, and adapted it where
necessary. Make report also passes with the changes applied.
Here are the commit logs of the branch:
commit 618d89e0a0682be9869d6e091c0dc2a19fa19487
Author: Stefan Westerfeld <stefan space twc de>
Date: Wed Oct 13 11:23:50 2010 +0200
BSE: Adapted FFT tests to new gslfft/FFTW compatible results.
Added DFT test from #491577, which now passes due to new style gslfft results.
commit 1924ca83ecb71986aa5c2a27917957d6eabc3a68
Author: Stefan Westerfeld <stefan space twc de>
Date: Wed Oct 13 11:21:44 2010 +0200
BSE: Use new scaled variants of gslfft to get same results as before.
commit 3523cfaa2563b819f9401e48d8bb5d541918c923
Author: Stefan Westerfeld <stefan space twc de>
Date: Wed Oct 13 11:20:35 2010 +0200
BSE: Make gslfft results compatible with fftw.
The Numerical Recipies FFT and FFTW behave differently because a minus
sign in the formula is different (so Numerical Recipies analysis performs
the same computation like FFTW synthesis and vice versa). This change
makes gslfft and FFTW results equivalent. Also, while FFTW never scales
results, gslfft used to scale results during backward transform. To get
the same results, scaling is now only performed optionally in gslfft.
Finally, real variants (like fftar/fftsr) have been changed to produce the
same results like real FFTW.
To merge this change, use:
repo: http://space.twc.de/public/git/stwbeast.git
branch: gslfft-fftw-compat
This merge request replaces the old bug-491577 merge request.
Cu... Stefan
--
Stefan Westerfeld, Hamburg/Germany, http://space.twc.de/~stefan
_______________________________________________
beast mailing list
beast gnome org
http://mail.gnome.org/mailman/listinfo/beast
Yours sincerely,
Tim Janik
---
http://lanedo.com/~timj/ - Founder and CEO of Lanedo GmbH.
Free software author and contributor on various projects.
From 84ab5eb1f76517ba657c37df94f47919a38d7b56 Mon Sep 17 00:00:00 2001
From: Stefan Westerfeld <stefan space twc de>
Date: Wed, 13 Oct 2010 11:20:35 +0200
Subject: [PATCH 1/4] BSE: Make gslfft results compatible with fftw.
The Numerical Recipies FFT and FFTW behave differently because a minus
sign in the formula is different (so Numerical Recipies analysis performs
the same computation like FFTW synthesis and vice versa). This change
makes gslfft and FFTW results equivalent. Also, while FFTW never scales
results, gslfft used to scale results during backward transform. To get
the same results, scaling is now only performed optionally in gslfft.
Finally, real variants (like fftar/fftsr) have been changed to produce the
same results like real FFTW.
---
bse/gsl-fftconf.sh | 150 ++++++++++++++++++++++++++++++++++++++--------------
bse/gsl-fftgen.pl | 104 +++++++++++++++++++++++++++---------
bse/gslfft.h | 85 ++++++++++++++++++++++++------
3 files changed, 257 insertions(+), 82 deletions(-)
diff --git a/bse/gsl-fftconf.sh b/bse/gsl-fftconf.sh
index 109c66e..f4f97bf 100755
--- a/bse/gsl-fftconf.sh
+++ b/bse/gsl-fftconf.sh
@@ -29,7 +29,7 @@ $MKFFT $OPTIONS 0
#
# generate small fft sizes, seperating stage 2
#
-for dir in --analysis --synthesis ; do
+for dir in --analysis --synthesis --synthesis-scale ; do
DOPT="$OPTIONS --skip-macros $dir"
echo "Generating FFT functions: $dir" >&2
$MKFFT $DOPT 2 F # standalone fft2
@@ -68,31 +68,44 @@ cat <<__EOF
/* public FFT frontends and generic handling of huge fft sizes */
+typedef enum {
+ BIG_ANALYSIS = 1,
+ BIG_SYNTHESIS = 2,
+ BIG_SYNTHESIS_SCALE = 3
+} FFTMode;
static void
gsl_power2_fftc_big (const unsigned int n_values,
const $IEEE_TYPE *rivalues_in,
$IEEE_TYPE *rivalues,
- const int esign)
+ FFTMode mode)
{
+ const int esign = (mode == BIG_ANALYSIS) ? -1 : 1;
const unsigned int n_values2 = n_values << 1;
double theta = esign < 0 ? -3.1415926535897932384626433832795029 : 3.1415926535897932384626433832795029;
unsigned int i, block_size = 8192 << 1;
double last_sin;
- if (esign > 0)
+ if (mode == BIG_SYNTHESIS)
{
if (rivalues_in)
- bitreverse_fft2analysis (n_values, rivalues_in, rivalues);
+ bitreverse_fft2synthesis (n_values, rivalues_in, rivalues);
for (i = 0; i < n_values; i += 8192)
- gsl_power2_fft8192analysis_skip2 (rivalues + (i << 1), rivalues + (i << 1));
+ gsl_power2_fft8192synthesis_skip2 (rivalues + (i << 1), rivalues + (i << 1));
}
- else
+ else if (mode == BIG_SYNTHESIS_SCALE)
{
if (rivalues_in)
- bitreverse_fft2synthesis (n_values, rivalues_in, rivalues);
+ bitreverse_fft2synthesis_scale (n_values, rivalues_in, rivalues);
for (i = 0; i < n_values; i += 8192)
- gsl_power2_fft8192synthesis_skip2 (rivalues + (i << 1), rivalues + (i << 1));
+ gsl_power2_fft8192synthesis_scale_skip2 (rivalues + (i << 1), rivalues + (i << 1));
+ }
+ else if (mode == BIG_ANALYSIS)
+ {
+ if (rivalues_in)
+ bitreverse_fft2analysis (n_values, rivalues_in, rivalues);
+ for (i = 0; i < n_values; i += 8192)
+ gsl_power2_fft8192analysis_skip2 (rivalues + (i << 1), rivalues + (i << 1));
}
theta *= (double) 1.0 / 8192.;
last_sin = sin (theta);
@@ -231,7 +244,7 @@ gsl_power2_fftac (const unsigned int n_values,
case 2048: gsl_power2_fft2048analysis (rivalues_in, rivalues_out); break;
case 4096: gsl_power2_fft4096analysis (rivalues_in, rivalues_out); break;
case 8192: gsl_power2_fft8192analysis (rivalues_in, rivalues_out); break;
- default: gsl_power2_fftc_big (n_values, rivalues_in, rivalues_out, +1);
+ default: gsl_power2_fftc_big (n_values, rivalues_in, rivalues_out, BIG_ANALYSIS);
}
}
void
@@ -257,7 +270,34 @@ gsl_power2_fftsc (const unsigned int n_values,
case 2048: gsl_power2_fft2048synthesis (rivalues_in, rivalues_out); break;
case 4096: gsl_power2_fft4096synthesis (rivalues_in, rivalues_out); break;
case 8192: gsl_power2_fft8192synthesis (rivalues_in, rivalues_out); break;
- default: gsl_power2_fftc_big (n_values, rivalues_in, rivalues_out, -1);
+ default: gsl_power2_fftc_big (n_values, rivalues_in, rivalues_out, BIG_SYNTHESIS);
+ }
+ /* { unsigned int i; for (i = 0; i < n_values << 1; i++) rivalues_out[i] *= (double) n_values; } */
+}
+void
+gsl_power2_fftsc_scale (const unsigned int n_values,
+ const $IEEE_TYPE *rivalues_in,
+ $IEEE_TYPE *rivalues_out)
+{
+ g_return_if_fail ((n_values & (n_values - 1)) == 0 && n_values >= 1);
+
+ switch (n_values)
+ {
+ case 1: rivalues_out[0] = rivalues_in[0], rivalues_out[1] = rivalues_in[1]; break;
+ case 2: gsl_power2_fft2synthesis_scale (rivalues_in, rivalues_out); break;
+ case 4: gsl_power2_fft4synthesis_scale (rivalues_in, rivalues_out); break;
+ case 8: gsl_power2_fft8synthesis_scale (rivalues_in, rivalues_out); break;
+ case 16: gsl_power2_fft16synthesis_scale (rivalues_in, rivalues_out); break;
+ case 32: gsl_power2_fft32synthesis_scale (rivalues_in, rivalues_out); break;
+ case 64: gsl_power2_fft64synthesis_scale (rivalues_in, rivalues_out); break;
+ case 128: gsl_power2_fft128synthesis_scale (rivalues_in, rivalues_out); break;
+ case 256: gsl_power2_fft256synthesis_scale (rivalues_in, rivalues_out); break;
+ case 512: gsl_power2_fft512synthesis_scale (rivalues_in, rivalues_out); break;
+ case 1024: gsl_power2_fft1024synthesis_scale (rivalues_in, rivalues_out); break;
+ case 2048: gsl_power2_fft2048synthesis_scale (rivalues_in, rivalues_out); break;
+ case 4096: gsl_power2_fft4096synthesis_scale (rivalues_in, rivalues_out); break;
+ case 8192: gsl_power2_fft8192synthesis_scale (rivalues_in, rivalues_out); break;
+ default: gsl_power2_fftc_big (n_values, rivalues_in, rivalues_out, BIG_SYNTHESIS_SCALE);
}
/* { unsigned int i; for (i = 0; i < n_values << 1; i++) rivalues_out[i] *= (double) n_values; } */
}
@@ -280,7 +320,7 @@ gsl_power2_fftar (const unsigned int n_values,
g_return_if_fail ((n_values & (n_values - 1)) == 0 && n_values >= 2);
- gsl_power2_fftac (n_cvalues, r_values_in, rivalues_out);
+ gsl_power2_fftsc (n_cvalues, r_values_in, rivalues_out);
theta = 3.1415926535897932384626433832795029;
theta /= (double) n_cvalues;
@@ -309,12 +349,12 @@ gsl_power2_fftar (const unsigned int n_values,
H1im = F2im + F1re;
H1re = F1im - F2re;
H2re = F2re - F1im;
- H2im = H1im - FEim;
+ H2im = FEim - H1im;
H1re += FEre;
H2re += FEre;
H1im += FEim;
rivalues_out[i] = H1re;
- rivalues_out[i + 1] = H1im;
+ rivalues_out[i + 1] = -H1im;
rivalues_out[r] = H2re;
rivalues_out[r + 1] = H2im;
WMULTIPLY (Wre, Wim, Dre, Dim);
@@ -322,11 +362,14 @@ gsl_power2_fftar (const unsigned int n_values,
Dre = rivalues_out[0];
rivalues_out[0] = Dre + rivalues_out[1];
rivalues_out[1] = Dre - rivalues_out[1];
+ rivalues_out[n_cvalues + 1] = -rivalues_out[n_cvalues + 1];
}
-void
-gsl_power2_fftsr (const unsigned int n_values,
- const double *rivalues_in,
- double *r_values_out)
+
+static inline void
+gsl_power2_fftsr_impl (const unsigned int n_values,
+ const double *rivalues_in,
+ double *r_values_out,
+ gboolean need_scale)
{
unsigned int n_cvalues = n_values >> 1;
double Dre, Dim, Wre, Wim, theta, scale;
@@ -360,9 +403,8 @@ gsl_power2_fftsr (const unsigned int n_values,
ri |= j;
FOre = -FOre;
- FOim = -FOim;
FEre *= 0.5;
- FEim *= 0.5;
+ FEim *= -0.5;
F2re = FOre * Wim;
F2im = FOim * Wim;
F1re = FOre * Wre;
@@ -391,9 +433,16 @@ gsl_power2_fftsr (const unsigned int n_values,
if (n_values < 4)
return;
r_values_out[2] = rivalues_in[i];
- r_values_out[2 + 1] = rivalues_in[i + 1];
- scale = n_cvalues;
- scale = 1.0 / scale;
+ r_values_out[2 + 1] = -rivalues_in[i + 1];
+ if (need_scale)
+ {
+ scale = n_cvalues;
+ scale = 1 / scale;
+ }
+ else
+ {
+ scale = 2;
+ }
for (i = 0; i < n_values; i += 4)
BUTTERFLY_10scale (r_values_out[i], r_values_out[i + 1],
r_values_out[i + 2], r_values_out[i + 3],
@@ -403,21 +452,39 @@ gsl_power2_fftsr (const unsigned int n_values,
switch (n_cvalues)
{
case 2: break;
- case 4: gsl_power2_fft4synthesis_skip2 (NULL, r_values_out); break;
- case 8: gsl_power2_fft8synthesis_skip2 (NULL, r_values_out); break;
- case 16: gsl_power2_fft16synthesis_skip2 (NULL, r_values_out); break;
- case 32: gsl_power2_fft32synthesis_skip2 (NULL, r_values_out); break;
- case 64: gsl_power2_fft64synthesis_skip2 (NULL, r_values_out); break;
- case 128: gsl_power2_fft128synthesis_skip2 (NULL, r_values_out); break;
- case 256: gsl_power2_fft256synthesis_skip2 (NULL, r_values_out); break;
- case 512: gsl_power2_fft512synthesis_skip2 (NULL, r_values_out); break;
- case 1024: gsl_power2_fft1024synthesis_skip2 (NULL, r_values_out); break;
- case 2048: gsl_power2_fft2048synthesis_skip2 (NULL, r_values_out); break;
- case 4096: gsl_power2_fft4096synthesis_skip2 (NULL, r_values_out); break;
- case 8192: gsl_power2_fft8192synthesis_skip2 (NULL, r_values_out); break;
- default: gsl_power2_fftc_big (n_cvalues, NULL, r_values_out, -1);
+ case 4: gsl_power2_fft4analysis_skip2 (NULL, r_values_out); break;
+ case 8: gsl_power2_fft8analysis_skip2 (NULL, r_values_out); break;
+ case 16: gsl_power2_fft16analysis_skip2 (NULL, r_values_out); break;
+ case 32: gsl_power2_fft32analysis_skip2 (NULL, r_values_out); break;
+ case 64: gsl_power2_fft64analysis_skip2 (NULL, r_values_out); break;
+ case 128: gsl_power2_fft128analysis_skip2 (NULL, r_values_out); break;
+ case 256: gsl_power2_fft256analysis_skip2 (NULL, r_values_out); break;
+ case 512: gsl_power2_fft512analysis_skip2 (NULL, r_values_out); break;
+ case 1024: gsl_power2_fft1024analysis_skip2 (NULL, r_values_out); break;
+ case 2048: gsl_power2_fft2048analysis_skip2 (NULL, r_values_out); break;
+ case 4096: gsl_power2_fft4096analysis_skip2 (NULL, r_values_out); break;
+ case 8192: gsl_power2_fft8192analysis_skip2 (NULL, r_values_out); break;
+ default: gsl_power2_fftc_big (n_cvalues, NULL, r_values_out, BIG_ANALYSIS);
}
}
+
+void
+gsl_power2_fftsr (const unsigned int n_values,
+ const double *rivalues_in,
+ double *r_values_out)
+{
+ gsl_power2_fftsr_impl (n_values, rivalues_in, r_values_out, FALSE);
+}
+
+void
+gsl_power2_fftsr_scale (const unsigned int n_values,
+ const double *rivalues_in,
+ double *r_values_out)
+{
+ gsl_power2_fftsr_impl (n_values, rivalues_in, r_values_out, TRUE);
+}
+
+
void
gsl_power2_fftar_simple (const unsigned int n_values,
const float *real_values,
@@ -442,10 +509,14 @@ gsl_power2_fftar_simple (const unsigned int n_values,
complex_values[n_values + 1] = 0.0;
g_free (rv);
}
+__EOF
+for FUNC_NAME in fftsr fftsr_scale
+do
+cat << __EOF
void
-gsl_power2_fftsr_simple (const unsigned int n_values,
- const float *complex_values,
- float *real_values)
+gsl_power2_${FUNC_NAME}_simple (const unsigned int n_values,
+ const float *complex_values,
+ float *real_values)
{
double *cv, *rv;
guint i;
@@ -458,10 +529,11 @@ gsl_power2_fftsr_simple (const unsigned int n_values,
while (i--)
cv[i] = complex_values[i];
cv[1] = complex_values[n_values];
- gsl_power2_fftsr (n_values, cv, rv);
+ gsl_power2_$FUNC_NAME (n_values, cv, rv);
i = n_values;
while (i--)
real_values[i] = rv[i];
g_free (cv);
}
__EOF
+done
diff --git a/bse/gsl-fftgen.pl b/bse/gsl-fftgen.pl
index 2d79d11..26ea79c 100755
--- a/bse/gsl-fftgen.pl
+++ b/bse/gsl-fftgen.pl
@@ -32,7 +32,9 @@ my $min_split = 2048;
my $min_compress = 512;
my $fft2loop = 0;
my $single_stage = 0;
-my $negate_sign = 0;
+my $negate_sign = 1;
+my $func_name = "analysis";
+my $scale = 0;
my $Wtest = 0;
#
@@ -42,8 +44,9 @@ while ($_ = $ARGV[0], defined $_ && /^-/) {
shift;
last if (/^--$/);
if (/^--fft2loop$/) { $fft2loop = 1 }
- elsif (/^--analysis$/) { $negate_sign = 0 }
- elsif (/^--synthesis$/) { $negate_sign = 1 }
+ elsif (/^--analysis$/) { $negate_sign = 1; $func_name = "analysis"; $scale = 0; }
+ elsif (/^--synthesis$/) { $negate_sign = 0; $func_name = "synthesis"; $scale = 0; }
+ elsif (/^--synthesis-scale$/) { $negate_sign = 0; $func_name = "synthesis_scale"; $scale = 1; }
elsif (/^--Wtest$/) { $Wtest = 1 }
elsif (/^--double$/) { $ieee_type = "double" }
elsif (/^--skip-macros$/) { $gen_macros = 0 }
@@ -52,6 +55,7 @@ while ($_ = $ARGV[0], defined $_ && /^-/) {
elsif (/^--min-compress$/) { $min_compress = shift }
elsif (/^--single-stage$/) { $single_stage = shift }
}
+
# parse arguments
my @arguments = 0;
if (defined $ARGV[0]) {
@@ -217,18 +221,18 @@ sub unroll_stage {
my $offset2 = $offset1 + $n_points;
my $offset1r = bitreverse ($fft_size, $offset1 >> 1) << 1;
my $offset2r = bitreverse ($fft_size, $offset2 >> 1) << 1;
- my $scale = !$negate_sign ? "__1, __0" : sprintf "1.0 / (%s) %u", $tmp_ieee_type, $fft_size;
+ my $scale_args = ! $scale ? "__1, __0" : sprintf "1.0 / (%s) %u", $tmp_ieee_type, $fft_size;
printf($indent."BUTTERFLY_10%s (X[%s], X[%s + 1],\n".
$indent." X[%s], X[%s + 1],\n".
$indent." Y[%s], Y[%s + 1],\n".
$indent." Y[%s], Y[%s + 1],\n".
$indent." %s);\n",
- $negate_sign ? "scale" : "",
+ $scale ? "scale" : "",
$offset1r, $offset1r,
$offset2r, $offset2r,
$offset1, $offset1,
$offset2, $offset2,
- $scale);
+ $scale_args);
}
}
} elsif ($unroll_outer) {
@@ -325,7 +329,7 @@ sub bitreverse_fft2 {
# mul_result = gsl_complex (c1.re * c2.re - c1.im * c2.im, c1.re * c2.im + c1.im * c2.re);
printf "
static inline void
-bitreverse_fft2analysis (const unsigned int n,
+bitreverse_fft2synthesis (const unsigned int n,
const %-6s *X,
%-6s *Y)
{
@@ -370,25 +374,26 @@ bitreverse_fft2analysis (const unsigned int n,
}
}
static inline void
-bitreverse_fft2synthesis (const unsigned int n,
- const %-6s *X,
- %-6s *Y)
+bitreverse_fft2synthesis_scale (const unsigned int n,
+ const %-6s *X,
+ %-6s *Y)
{
const unsigned int n2 = n >> 1, n1 = n + n2, max = n >> 2;
unsigned int i, r;
%s scale = n;
scale = 1.0 / scale;
+
BUTTERFLY_10scale (X[0], X[1],
X[n], X[n + 1],
Y[0], Y[1],
Y[2], Y[3],
scale);
BUTTERFLY_10scale (X[n2], X[n2 + 1],
- X[n1], X[n1 + 1],
- Y[4], Y[5],
- Y[6], Y[7],
- scale);
+ X[n1], X[n1 + 1],
+ Y[4], Y[5],
+ Y[6], Y[7],
+ scale);
for (i = 1, r = 0; i < max; i++)
{
unsigned int k, j = n >> 1;
@@ -403,20 +408,65 @@ bitreverse_fft2synthesis (const unsigned int n,
k = r >> 1;
j = i << 3;
BUTTERFLY_10scale (X[k], X[k + 1],
- X[k + n], X[k + n + 1],
- Y[j], Y[j + 1],
- Y[j + 2], Y[j + 3],
- scale);
+ X[k + n], X[k + n + 1],
+ Y[j], Y[j + 1],
+ Y[j + 2], Y[j + 3],
+ scale);
k += n2;
j += 4;
BUTTERFLY_10scale (X[k], X[k + 1],
- X[k + n], X[k + n + 1],
- Y[j], Y[j + 1],
- Y[j + 2], Y[j + 3],
- scale);
+ X[k + n], X[k + n + 1],
+ Y[j], Y[j + 1],
+ Y[j + 2], Y[j + 3],
+ scale);
+ }
+}
+static inline void
+bitreverse_fft2analysis (const unsigned int n,
+ const %-6s *X,
+ %-6s *Y)
+{
+ const unsigned int n2 = n >> 1, n1 = n + n2, max = n >> 2;
+ unsigned int i, r;
+
+ BUTTERFLY_10 (X[0], X[1],
+ X[n], X[n + 1],
+ Y[0], Y[1],
+ Y[2], Y[3],
+ __0, __1);
+ BUTTERFLY_10 (X[n2], X[n2 + 1],
+ X[n1], X[n1 + 1],
+ Y[4], Y[5],
+ Y[6], Y[7],
+ __0, __1);
+ for (i = 1, r = 0; i < max; i++)
+ {
+ unsigned int k, j = n >> 1;
+
+ while (r >= j)
+ {
+ r -= j;
+ j >>= 1;
+ }
+ r |= j;
+
+ k = r >> 1;
+ j = i << 3;
+ BUTTERFLY_10 (X[k], X[k + 1],
+ X[k + n], X[k + n + 1],
+ Y[j], Y[j + 1],
+ Y[j + 2], Y[j + 3],
+ __0, __1);
+ k += n2;
+ j += 4;
+ BUTTERFLY_10 (X[k], X[k + 1],
+ X[k + n], X[k + n + 1],
+ Y[j], Y[j + 1],
+ Y[j + 2], Y[j + 3],
+ __0, __1);
}
}
-", $ieee_type, $ieee_type, $ieee_type, $ieee_type, $tmp_ieee_type;
+", $ieee_type, $ieee_type, $ieee_type, $ieee_type, $tmp_ieee_type, $ieee_type, $ieee_type, $tmp_ieee_type;
# testing:
# define BUTTERFLY_10(X1re,X1im,X2re,X2im,Y1re,Y1im,Y2re,Y2im,Wre,Wim,T1re,T1im,T2re,T2im) \
@@ -698,7 +748,7 @@ sub gen_stage {
if ($n_points == 2) { # input stage needs special handling
if ($kind eq 'L') {
printf "\n%s/* perform fft2 and bitreverse input */\n", $indent;
- printf "%sbitreverse_fft2%s (%u, X, Y);\n", $indent, $negate_sign ? "synthesis" : "analysis", $fft_size;
+ printf "%sbitreverse_fft2%s (%u, X, Y);\n", $indent, $func_name, $fft_size;
} elsif ($kind eq 'F') {
printf "\n%s/* perform %u times fft2 */\n", $indent, $times;
unroll_stage ($fft_size, $n_points, $times, 1);
@@ -723,10 +773,10 @@ sub gen_stage {
for (my $i = 0; $i < $times; $i++) {
if ($i) {
printf($indent."gsl_power2_fft%u%s_skip2 (X + %u, Y + %u);\n",
- $n_points, $negate_sign ? "synthesis" : "analysis", $n_points * $i << 1, $n_points * $i << 1);
+ $n_points, $func_name, $n_points * $i << 1, $n_points * $i << 1);
} else {
printf($indent."gsl_power2_fft%u%s_skip2 (X, Y);\n",
- $n_points, $negate_sign ? "synthesis" : "analysis");
+ $n_points, $func_name);
}
}
} else {
@@ -786,7 +836,7 @@ print " */\n";
printf "static void\n";
printf("gsl_power2_fft%u%s%s (const %s *X, %s *Y)\n{\n",
- $fft_size, $negate_sign ? "synthesis" : "analysis",
+ $fft_size, $func_name,
$skip2 ? "_skip2" : "",
$ieee_type, $ieee_type);
printf "%sregister unsigned int butterfly, block, offset;\n", $indent;
diff --git a/bse/gslfft.h b/bse/gslfft.h
index f440f63..5ba154b 100644
--- a/bse/gslfft.h
+++ b/bse/gslfft.h
@@ -42,15 +42,18 @@ extern "C" {
* equivalent).
*
* In general for the gsl_power2_fft*() family of functions, normalization is
- * only performed during backward transform. However, a popular mathematical
- * strategy of defining the FFT and IFFT in a way that the formulas are
- * symmetric is normalizing both, the forward and backward transform with
- * 1/sqrt(N) - where N is the number of complex values (n_values).
+ * only performed during backward transform if the gsl_power2_fftsc_scale()
+ * is used. No normalization is performed if gsl_power2_fftsc() is used.
+ *
+ * However, a popular mathematical strategy of defining the FFT and IFFT in a
+ * way that the formulas are symmetric is normalizing both, the forward and
+ * backward transform with 1/sqrt(N) - where N is the number of complex values
+ * (n_values).
*
* Compared to the above definition, in this implementation, the analyzed
* values produced by gsl_power2_fftac()/gsl_power2_fftar() will be too large
* by a factor of sqrt(N), which however are cancelled out on the backward
- * transform.
+ * transform (for _scale variants).
*
* Note that the transformation is performed out of place, the input
* array is not modified, and may not overlap with the output array.
@@ -69,11 +72,10 @@ void gsl_power2_fftac (const unsigned int n_values,
* represents the counterpart to gsl_power2_fftac(), that is, a value
* array which is transformed into the frequency domain with
* gsl_power2_fftac() can be reconstructed by issuing gsl_power2_fftsc()
- * on the transform.
+ * on the transform. This function does not perform scaling, so calling
+ * gsl_power2_fftac() and gsl_power2_fftsc() will scale the data with a factor
+ * of n_values. See also gsl_power2_fftsc_scale().
*
- * This function also scales the time domain coefficients by a
- * factor of 1.0/n_values which is required for perfect reconstruction
- * of time domain data formerly transformed via gsl_power2_fftac().
* More details on normalization can be found in the documentation of
* gsl_power2_fftac().
*
@@ -85,10 +87,34 @@ void gsl_power2_fftsc (const unsigned int n_values,
double *ri_values_out);
/**
- * @param n_values Number of real sample values
- * @param r_values_in Real sample values [0..n_values-1]
- * @param ri_values_out Complex frequency values [0..n_values-1]
+ * gsl_power2_fftsc_scale
+ * @n_values: Number of complex values
+ * @ri_values_in: Complex frequency values [0..n_values*2-1]
+ * @ri_values_out: Complex sample values [0..n_values*2-1]
+ * This function performs a decimation in time fourier transformation
+ * in backwards direction with normalization. As such, this function
+ * represents the counterpart to gsl_power2_fftac(), that is, a value
+ * array which is transformed into the frequency domain with
+ * gsl_power2_fftac() can be reconstructed by issuing gsl_power2_fftsc()
+ * on the transform.
*
+ * This function also scales the time domain coefficients by a
+ * factor of 1.0/n_values which is required for perfect reconstruction
+ * of time domain data formerly transformed via gsl_power2_fftac().
+ * More details on normalization can be found in the documentation of
+ * gsl_power2_fftac().
+ *
+ * Note that the transformation is performed out of place, the input
+ * array is not modified, and may not overlap with the output array.
+ */
+void gsl_power2_fftsc_scale (const unsigned int n_values,
+ const double *ri_values_in,
+ double *ri_values_out);
+/**
+ * gsl_power2_fftar
+ * @n_values: Number of real sample values
+ * @r_values_in: Real sample values [0..n_values-1]
+ * @ri_values_out: Complex frequency values [0..n_values-1]
* Real valued variant of gsl_power2_fftac(), the input array contains
* real valued equidistant sampled data [0..n_values-1], and the output
* array contains the positive frequency half of the complex valued
@@ -124,6 +150,30 @@ void gsl_power2_fftar (const unsigned int n_values,
* A real valued data set transformed into the frequency domain
* with gsl_power2_fftar() can be reconstructed using this function.
*
+ * This function does not perform normalization, so data that is transformed
+ * back from gsl_power2_fftar() will be scaled by a factor of n_values. See
+ * also gsl_power2_fftsr_scale().
+ *
+ * More details on normalization can be found in the documentation of
+ * gsl_power2_fftac().
+ *
+ * Note that the transformation is performed out of place, the input
+ * array is not modified, and may not overlap with the output array.
+ */
+void gsl_power2_fftsr (const unsigned int n_values,
+ const double *ri_values_in,
+ double *r_values_out);
+
+/**
+ * gsl_power2_fftsr_scale
+ * @n_values: Number of real sample values
+ * @ri_values_in: Complex frequency values [0..n_values-1]
+ * @r_values_out: Real sample values [0..n_values-1]
+ * Real valued variant of gsl_power2_fftsc(), counterpart to
+ * gsl_power2_fftar(), using the same frequency storage format.
+ * A real valued data set transformed into the frequency domain
+ * with gsl_power2_fftar() can be reconstructed using this function.
+ *
* This function also scales the time domain coefficients by a
* factor of 1.0/(n_values/2) which is required for perfect
* reconstruction of time domain data formerly transformed via
@@ -134,9 +184,9 @@ void gsl_power2_fftar (const unsigned int n_values,
* Note that the transformation is performed out of place, the input
* array is not modified, and may not overlap with the output array.
*/
-void gsl_power2_fftsr (const unsigned int n_values,
- const double *ri_values_in,
- double *r_values_out);
+void gsl_power2_fftsr_scale (const unsigned int n_values,
+ const double *ri_values_in,
+ double *r_values_out);
/* --- convenience wrappers --- */
@@ -146,7 +196,10 @@ void gsl_power2_fftar_simple (const unsigned int n_values,
void gsl_power2_fftsr_simple (const unsigned int n_values,
const float *complex_values,
float *real_values);
-
+void gsl_power2_fftsr_scale_simple (const unsigned int n_values,
+ const float *complex_values,
+ float *real_values);
+
#ifdef __cplusplus
--
1.7.1
From 3b90a223d49949d8a6a44fb48f7c3c00156007b6 Mon Sep 17 00:00:00 2001
From: Stefan Westerfeld <stefan space twc de>
Date: Wed, 13 Oct 2010 11:21:44 +0200
Subject: [PATCH 2/4] BSE: Use new scaled variants of gslfft to get same results as before.
---
bse/gslfilter.c | 2 +-
bse/gslosctable.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/bse/gslfilter.c b/bse/gslfilter.c
index d9c3016..698a0bc 100644
--- a/bse/gslfilter.c
+++ b/bse/gslfilter.c
@@ -921,7 +921,7 @@ gsl_filter_fir_approx (unsigned int iorder,
fft_in[1] = val;
}
- gsl_power2_fftsr (fft_size, fft_in, fft_out);
+ gsl_power2_fftsr_scale (fft_size, fft_in, fft_out);
for (i = 0; i <= iorder / 2; i++)
{
diff --git a/bse/gslosctable.c b/bse/gslosctable.c
index f6beb90..b75866e 100644
--- a/bse/gslosctable.c
+++ b/bse/gslosctable.c
@@ -299,7 +299,7 @@ cache_table_ref_entry (GslOscWaveForm wave_form,
gsl_power2_fftar_simple (e->n_values, values, fft);
step = e->mfreq * (gdouble) e->n_values;
fft_filter (e->n_values, fft, step, filter_func);
- gsl_power2_fftsr_simple (e->n_values, fft, values);
+ gsl_power2_fftsr_scale_simple (e->n_values, fft, values);
g_free (fft);
gsl_osc_wave_normalize (e->n_values, values, (min + max) / 2, max);
--
1.7.1
From b92c2f046b705fb5e66ff869848b72fafde0cdd9 Mon Sep 17 00:00:00 2001
From: Stefan Westerfeld <stefan space twc de>
Date: Wed, 13 Oct 2010 11:23:50 +0200
Subject: [PATCH 3/4] BSE: Adapted FFT tests to new gslfft/FFTW compatible results.
Added DFT test from #491577, which now passes due to new style gslfft results.
---
bse/tests/testfft.c | 183 +++++++++++++++++++++++++++++++++++++++++++++++----
1 files changed, 169 insertions(+), 14 deletions(-)
diff --git a/bse/tests/testfft.c b/bse/tests/testfft.c
index 4eb6793..86b1dc3 100644
--- a/bse/tests/testfft.c
+++ b/bse/tests/testfft.c
@@ -24,23 +24,37 @@
#include <stdlib.h>
#include <string.h>
-
#define MAX_FFT_SIZE (65536 * 2) // * 8 * 8
+#define MAX_DFT_SIZE (1024 * 2) // * 8 * 8
#define EPSILON (4.8e-6)
+#define REF_ANALYSIS (-1)
+#define REF_SYNTHESIS (1)
/* --- prototypes --- */
static void reference_power2_fftc (unsigned int n_values,
const double *rivalues_in,
double *rivalues_out,
int esign);
+static void reference_dftc (unsigned int n_values,
+ const double *rivalues_in,
+ double *rivalues_out);
static void fill_rand (guint n,
double *a);
+static void scale_block (guint n,
+ double *a,
+ double factor);
static double diff (guint m,
guint p,
double *a1,
double *a2,
const gchar *str);
+static void make_real (guint n,
+ double *a);
+static void extract_real (guint n,
+ const double *a,
+ double *b);
+
/* --- functions --- */
@@ -58,17 +72,18 @@ main (int argc,
gettimeofday (&tv, NULL);
srand (tv.tv_sec ^ tv.tv_usec);
- double ref_fft_in[MAX_FFT_SIZE] = { 0, };
- double ref_fft_aout[MAX_FFT_SIZE] = { 0, };
- double ref_fft_sout[MAX_FFT_SIZE] = { 0, };
- double ref_fft_back[MAX_FFT_SIZE] = { 0, };
- double work_fft_in[MAX_FFT_SIZE] = { 0, };
- double work_fft_aout[MAX_FFT_SIZE] = { 0, };
- double work_fft_sout[MAX_FFT_SIZE] = { 0, };
- double work_fft_back[MAX_FFT_SIZE] = { 0, };
+ static double ref_fft_in[MAX_FFT_SIZE] = { 0, };
+ static double ref_fft_aout[MAX_FFT_SIZE] = { 0, };
+ static double ref_fft_sout[MAX_FFT_SIZE] = { 0, };
+ static double ref_fft_back[MAX_FFT_SIZE] = { 0, };
+ static double work_fft_in[MAX_FFT_SIZE] = { 0, };
+ static double work_fft_aout[MAX_FFT_SIZE] = { 0, };
+ static double work_fft_sout[MAX_FFT_SIZE] = { 0, };
+ static double work_fft_back[MAX_FFT_SIZE] = { 0, };
+ static double scaled_fft_back[MAX_FFT_SIZE] = { 0, };
/* run tests */
- for (i = 2; i <= MAX_FFT_SIZE >> 1; i <<= 1)
+ for (i = 8; i <= MAX_FFT_SIZE >> 1; i <<= 1)
{
double d;
@@ -83,14 +98,17 @@ main (int argc,
// memset (work_fft_aout, 0, MAX_FFT_SIZE * sizeof (work_fft_aout[0]));
// memset (work_fft_sout, 0, MAX_FFT_SIZE * sizeof (work_fft_sout[0]));
// memset (work_fft_back, 0, MAX_FFT_SIZE * sizeof (work_fft_sout[0]));
- reference_power2_fftc (i, ref_fft_in, ref_fft_aout, +1);
- reference_power2_fftc (i, ref_fft_in, ref_fft_sout, -1);
- reference_power2_fftc (i, ref_fft_aout, ref_fft_back, -1);
+ reference_power2_fftc (i, ref_fft_in, ref_fft_aout, REF_ANALYSIS);
+ reference_power2_fftc (i, ref_fft_in, ref_fft_sout, REF_SYNTHESIS);
+ reference_power2_fftc (i, ref_fft_aout, ref_fft_back, REF_SYNTHESIS);
+ scale_block (i << 1, ref_fft_back, 1.0 / i);
/* perform fft test */
gsl_power2_fftac (i, work_fft_in, work_fft_aout);
gsl_power2_fftsc (i, work_fft_in, work_fft_sout);
gsl_power2_fftsc (i, work_fft_aout, work_fft_back);
+ scale_block (i << 1, work_fft_back, 1.0 / i);
+ gsl_power2_fftsc_scale (i, work_fft_aout, scaled_fft_back);
/* check differences */
d = diff (i << 1, 0, ref_fft_in, work_fft_in, "Checking input record");
@@ -118,11 +136,86 @@ main (int argc,
TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
else
TOK();
+ d = diff (i << 1, 0, work_fft_in, scaled_fft_back, "GSL analysis and scaled re-synthesis");
+ if (fabs (d) > EPSILON)
+ TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
+ else
+ TOK();
d = diff (i << 1, 0, ref_fft_back, work_fft_back, "Reference re-synthesis vs. GSL");
if (fabs (d) > EPSILON)
TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
else
TOK();
+ d = diff (i << 1, 0, ref_fft_back, scaled_fft_back, "Reference re-synthesis vs. scaled GSL");
+ if (fabs (d) > EPSILON)
+ TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
+ else
+ TOK();
+
+ /* test with real data */
+ make_real (i << 1, ref_fft_in);
+ extract_real (i << 1, ref_fft_in, work_fft_in);
+ reference_power2_fftc (i, ref_fft_in, ref_fft_aout, REF_ANALYSIS);
+ ref_fft_aout[1] = ref_fft_aout[i]; /* special packing for purely real FFTs */
+
+ /* perform real fft test */
+ gsl_power2_fftar (i, work_fft_in, work_fft_aout);
+ gsl_power2_fftsr (i, work_fft_aout, work_fft_back);
+ scale_block (i, work_fft_back, 1.0 / i);
+ gsl_power2_fftsr_scale (i, work_fft_aout, scaled_fft_back);
+
+ d = diff (i, 0, ref_fft_aout, work_fft_aout, "Reference real analysis vs. real GSL");
+ if (fabs (d) > EPSILON)
+ TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
+ else
+ TOK();
+
+ d = diff (i, 0, work_fft_in, scaled_fft_back, "Real input vs. scaled real GSL resynthesis");
+ if (fabs (d) > EPSILON)
+ TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
+ else
+ TOK();
+
+ d = diff (i, 0, work_fft_in, work_fft_back, "Real input vs. real GSL resynthesis");
+ if (fabs (d) > EPSILON)
+ TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
+ else
+ TOK();
+
+ TDONE();
+ }
+
+ static double dft_in[MAX_DFT_SIZE] = { 0, };
+ static double dft_aout[MAX_DFT_SIZE] = { 0, };
+
+ /* test reference fft against reference dft */
+ for (i = 2; i <= MAX_DFT_SIZE >> 1; i <<= 1)
+ {
+ double d;
+
+ TSTART ("Checking reference fft for size %u", i);
+
+ /* setup reference and work fft records */
+ fill_rand (i << 1, ref_fft_in);
+ memcpy (dft_in, ref_fft_in, MAX_FFT_SIZE * sizeof (dft_in[0]));
+
+ reference_power2_fftc (i, ref_fft_in, ref_fft_aout, REF_ANALYSIS);
+ reference_dftc (i, dft_in, dft_aout);
+
+
+ /* check differences */
+ d = diff (i << 1, 0, ref_fft_in, dft_in, "Checking input record");
+ if (d)
+ TERROR ("Input record was modified");
+ else
+ TOK();
+
+ d = diff (i << 1, 0, ref_fft_aout, dft_aout, "Reference FFT analysis against reference DFT analysis");
+ if (fabs (d) > EPSILON)
+ TERROR ("Error sum in analysis FFT exceeds epsilon: %g > %g", d, EPSILON);
+ else
+ TOK();
+
TDONE();
}
@@ -137,6 +230,35 @@ fill_rand (guint n,
a[n] = -1. + 2. * rand() / (RAND_MAX + 1.0);
}
+static void
+make_real (guint n,
+ double *a)
+{
+ guint x;
+ for (x = 1; x < n; x += 2)
+ a[x] = 0; /* eliminate complex part */
+}
+
+static void
+extract_real (guint n,
+ const double *a,
+ double *b)
+{
+ guint x;
+ for (x = 0; x < n; x += 2)
+ *b++ = a[x]; /* extract real part */
+}
+
+
+static void
+scale_block (guint n,
+ double *a,
+ double factor)
+{
+ while (n--)
+ a[n] *= factor;
+}
+
static double
diff (guint m,
guint p,
@@ -296,7 +418,7 @@ reference_bitreverse_fft2synthesis (const unsigned int n,
unsigned int i, r;
double scale = n;
- scale = 1.0 / scale;
+ scale = 1; /* set to 1.0 / scale to get scaled synthesis */
BUTTERFLY_10scale (X[0], X[1],
X[n], X[n + 1],
Y[0], Y[1],
@@ -459,3 +581,36 @@ reference_power2_fftc (unsigned int n_values,
}
while (block_size <= n_values);
}
+
+/*--------------- reference DFT -----------------*/
+
+static BseComplex
+complex_exp (BseComplex z)
+{
+ /* also found in g++-4.2 C++ complex numbers */
+ return bse_complex_polar (exp(z.re), z.im);
+}
+
+void
+reference_dftc (unsigned int n_values,
+ const double *rivalues_in,
+ double *rivalues_out)
+{
+ /* http://en.wikipedia.org/wiki/Discrete_Fourier_transform says:
+ *
+ * out[k] = SUM{n=0..N-1} (in[n] * exp (-2 * pi * j / N * k * n))
+ */
+ guint k, n;
+ for (k = 0; k < n_values; k++)
+ {
+ BseComplex result = { 0, 0 };
+
+ for (n = 0; n < n_values; n++)
+ result = bse_complex_add (result,
+ bse_complex_mul (bse_complex (rivalues_in[n * 2], rivalues_in[n * 2 + 1]),
+ complex_exp (bse_complex (0, -2 * PI / n_values * ((k * n) % n_values)))));
+
+ rivalues_out[k * 2] = result.re;
+ rivalues_out[k * 2 + 1] = result.im;
+ }
+}
--
1.7.1
From 757128a51ed3f76289a6402808db513aac71ef05 Mon Sep 17 00:00:00 2001
From: Tim Janik <timj gtk org>
Date: Mon, 16 May 2011 22:51:15 +0200
Subject: [PATCH 4/4] BSE: fft docu fixes
---
bse/gslfft.h | 21 +++++++++------------
1 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/bse/gslfft.h b/bse/gslfft.h
index 5ba154b..766f4eb 100644
--- a/bse/gslfft.h
+++ b/bse/gslfft.h
@@ -87,10 +87,9 @@ void gsl_power2_fftsc (const unsigned int n_values,
double *ri_values_out);
/**
- * gsl_power2_fftsc_scale
- * @n_values: Number of complex values
- * @ri_values_in: Complex frequency values [0..n_values*2-1]
- * @ri_values_out: Complex sample values [0..n_values*2-1]
+ * @param n_values Number of complex values
+ * @param ri_values_in Complex frequency values [0..n_values*2-1]
+ * @param ri_values_out Complex sample values [0..n_values*2-1]
* This function performs a decimation in time fourier transformation
* in backwards direction with normalization. As such, this function
* represents the counterpart to gsl_power2_fftac(), that is, a value
@@ -111,10 +110,9 @@ void gsl_power2_fftsc_scale (const unsigned int n_values,
const double *ri_values_in,
double *ri_values_out);
/**
- * gsl_power2_fftar
- * @n_values: Number of real sample values
- * @r_values_in: Real sample values [0..n_values-1]
- * @ri_values_out: Complex frequency values [0..n_values-1]
+ * @param n_values Number of real sample values
+ * @param r_values_in Real sample values [0..n_values-1]
+ * @param ri_values_out Complex frequency values [0..n_values-1]
* Real valued variant of gsl_power2_fftac(), the input array contains
* real valued equidistant sampled data [0..n_values-1], and the output
* array contains the positive frequency half of the complex valued
@@ -165,10 +163,9 @@ void gsl_power2_fftsr (const unsigned int n_values,
double *r_values_out);
/**
- * gsl_power2_fftsr_scale
- * @n_values: Number of real sample values
- * @ri_values_in: Complex frequency values [0..n_values-1]
- * @r_values_out: Real sample values [0..n_values-1]
+ * @param n_values Number of real sample values
+ * @param ri_values_in Complex frequency values [0..n_values-1]
+ * @param r_values_out Real sample values [0..n_values-1]
* Real valued variant of gsl_power2_fftsc(), counterpart to
* gsl_power2_fftar(), using the same frequency storage format.
* A real valued data set transformed into the frequency domain
--
1.7.1
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]