Re: [tim-janik/beast] BSE: bsemathsignal: add approximations: Bse::fast_log2 and Bse::fast_exp2 (#124)



As mentioned on IRC, enabling optimizations with MODE=release and choosing clang++ (6.0 here) or g++ (7.4 here) makes a major difference when benchmarking exp2f and log2f from glibc against our approximations. On a modern AMD64 processor, glibc is often faster. Internally it also uses polynomials of around order 4, but picks the coefficients from a table depending on the input argument. That way it achieves errors < 1 ULP, and it is often faster because it can also use hand-crafted SSE2 implementations.
I haven't had a chance to benchmark the approximations on musl, but so far, based on your submission, I'm inclined to integrate the following:

  1. Rename bse_approx6_exp2 to fast_exp2() and get rid of the other approximation variants.
  2. Add fast_log2() based on your 6th order version, but with error correction for integer logarithms.
  3. When building for AMD64, use exp2f to implement fast_exp2 and use log2f to implement fast_log2.
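A minimal sketch of how the dispatch in item 3 might look for fast_exp2; fast_log2 would follow the same pattern. The fallback poly_exp2() is a hypothetical stand-in (a short Taylor polynomial), not the approximation from the pull request, and the preprocessor test for AMD64 is an assumption about how the build would detect the target:

```cpp
#include <cmath>

// Hypothetical portable fallback, NOT the approximation from the pull request:
// splits x into a nearest integer i and a fraction f in [-0.5, +0.5], then
// evaluates a 5th order Taylor polynomial of e^t with t = f * ln(2).
static inline float
poly_exp2 (float x)
{
  const long i = std::lround (x);           // nearest integer part
  const float t = (x - i) * 0.69314718f;    // f * ln(2), |t| <= 0.347
  float u = 1.0f / 120;
  u = u * t + 1.0f / 24;
  u = u * t + 1.0f / 6;
  u = u * t + 0.5f;
  u = u * t + 1.0f;
  u = u * t + 1.0f;
  return std::ldexp (u, int (i));           // u * 2^i
}

static inline float
fast_exp2 (float x)
{
#if defined (__x86_64__) || defined (_M_X64)
  return std::exp2 (x);                     // float overload, i.e. exp2f from libm
#else
  return poly_exp2 (x);                     // approximation on other targets
#endif
}
```

Callers only ever see fast_exp2(), so the choice between libm and the approximation stays a build-time detail.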

Here's the error correction I'm talking about. Note that exchanging "long double" for "float" makes the code significantly slower, because it forces the compiler to add code to reduce precision. On my machine, this version is roughly as fast as log2f when compiling with optimizations, with both compilers:

static inline long double G_GNUC_CONST
fast_log2f (float value)
{
  union {
    float f;
    int i;
  } float_u;
  float_u.f = value;
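  // extract the biased exponent; subtracting 128 instead of the IEEE 754 bias
  // of 127 yields log2 - 1, which the "+1" in the polynomial below compensates
  const int log_2 = ((float_u.i >> 23) & 255) - 128;
  // replace the exponent with the bias, so float_u.f holds the mantissa in [1,2)
  float_u.i &= ~(255 << 23);
  float_u.i += BSE_FLOAT_BIAS << 23;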
  long double u, x = float_u.f;
  // lolremez --long-double -d 6 -r 1:2 "log(x)/log(2)+1-0.00000184568668708"
  u =         -2.5691088815846393966e-2l;
  u = u * x +  2.7514877034856806734e-1l;
  u = u * x + -1.2669182593669424748l;
  u = u * x +  3.2865287704176774059l;
  u = u * x + -5.3419892025067624343l;
  u = u * x +  6.1129631283200211528l;
  x = u * x + -2.040042118396715321l;
  return x + log_2;
}

Error samples, compared to log2l(3):

   +0.0, -0.00000231613294631
   +0.5, +0.00000000000000000
   +1.0, +0.00000000000000000
   +1.1, -0.00000181973000285
   +1.5, -0.00000130387210186
   +1.8, -0.00000312228549678
   +2.0, +0.00000000000000000
   +2.2, -0.00000181973000285
   +2.5, -0.00000140048214306
   +3.0, -0.00000130387210186
   +4.0, +0.00000000000000000
   +5.0, -0.00000140048214306
   +6.0, -0.00000130387210186
   +7.0, -0.00000312228549678
   +8.0, +0.00000000000000000
   +9.0, -0.00000084878575295
  +10.0, -0.00000140048214306
  +11.0, -0.00000368176020430
  +16.0, +0.00000000000000000
  +32.0, +0.00000000000000000
  +40.0, -0.00000140048214306
  +48.0, -0.00000130387210186
  +54.0, -0.00000149844406951
  +64.0, +0.00000000000000000
 +127.0, -0.00000162654178981
 +128.0, +0.00000000000000000
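
For anyone who wants to measure such errors outside the BEAST tree, here is a self-contained sketch of the same routine. It assumes BSE_FLOAT_BIAS is 127 (the IEEE 754 single precision bias), swaps the union for memcpy() to do the type punning, and drops G_GNUC_CONST; the names fast_log2f_sketch() and log2_error() are made up for illustration, and the sign convention of the difference is an assumption:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

static const std::uint32_t FLOAT_BIAS = 127;    // assumed value of BSE_FLOAT_BIAS

static inline long double
fast_log2f_sketch (float value)
{
  std::uint32_t bits;
  std::memcpy (&bits, &value, sizeof (bits));   // reinterpret the float as integer
  // biased exponent minus 128 is log2 - 1, the polynomial adds the 1 back
  const int log_2 = int ((bits >> 23) & 255) - 128;
  // force the exponent to the bias, leaving the mantissa in [1,2)
  bits &= ~(std::uint32_t (255) << 23);
  bits += FLOAT_BIAS << 23;
  float mantissa;
  std::memcpy (&mantissa, &bits, sizeof (mantissa));
  long double u, x = mantissa;
  // coefficients copied from the function above
  u =         -2.5691088815846393966e-2l;
  u = u * x +  2.7514877034856806734e-1l;
  u = u * x + -1.2669182593669424748l;
  u = u * x +  3.2865287704176774059l;
  u = u * x + -5.3419892025067624343l;
  u = u * x +  6.1129631283200211528l;
  x = u * x + -2.040042118396715321l;
  return x + log_2;
}

// difference between the approximation and the libm long double logarithm
static inline long double
log2_error (float value)
{
  return fast_log2f_sketch (value) - std::log2 ((long double) value);
}
```

Calling log2_error() in a loop over the sample inputs and printing the result with "%+.17Lf" should produce a listing in the style of the one above, though the exact digits depend on the platform's long double.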

