First of all, this adds a function that was missing in bsemathsignal: a fast_log2 implementation. It works by using the float exponent and approximating the rest of the value using a polynomial. I found a tool that gives optimal polynomial coefficients for a given order: https://github.com/samhocevar/lolremez
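To make the idea concrete, here is a minimal sketch of the exponent-plus-polynomial split. It is not the fast_log2 code from this patch: the name sketch_log2 is hypothetical, it uses std::frexp instead of reading the exponent bits directly, and the quadratic is a hand-derived placeholder rather than lolremez output. It assumes x is finite and greater than 0.

```cpp
#include <cmath>

// Hypothetical sketch, not the fast_log2 from the patch.
// Assumes x is finite and > 0.
inline double
sketch_log2 (double x)
{
  int e;
  const double f = std::frexp (x, &e); // x = f * 2^e with f in [0.5, 1)
  const double m = 2 * f;              // mantissa rescaled to [1, 2)
  const int    i = e - 1;              // integer part of log2 (x)
  // Placeholder polynomial p(m) = -1/3 m^2 + 2 m - 5/3, evaluated in
  // Horner form. It satisfies p(1) = 0 and p(2) = 1; a Remez fit of the
  // same order would have a smaller maximum error.
  double u = -1.0 / 3.0;
  u = u * m + 2.0;
  u = u * m - 5.0 / 3.0;
  return i + u;
}
```

With this placeholder polynomial the result stays within roughly 0.01 of std::log2; the point is only the structure, i.e. the integer part from the exponent plus a polynomial over the mantissa.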
Note that your code adds a needless cast that costs time and precision. The formulas all contain terms like:
u = u * x + T (constant); // T ∈ { float, double, long double }
With T=float, for example, this costs precision. X86 FPUs load the constants into one of the internal FPU registers, which are 80 bits wide. Casting the constant to float beforehand truncates some of its last digits without making the subsequent FPU-internal operations any faster.
A similar precision loss can be observed on AMD64, for example:
fast_log2<6, double> (+1.5)
0.00000054181458465
fast_log2<6, float> (+1.5)
0.00000153622037214
In other words, I'd strongly recommend removing that cast.
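The effect of the cast can be looked at in isolation with a small sketch like the following. The helper names horner_step and cast_error are hypothetical and only mirror the shape of the term quoted above; the arithmetic is assumed to run in double so that the only difference between instantiations is the rounding of the constant to T.

```cpp
#include <cmath>

// Hypothetical helper mirroring `u = u * x + T (constant)`.
template<typename T>
inline double
horner_step (double u, double x, double constant)
{
  // The constant is rounded to T first and only then widened again
  // for the addition, so digits beyond T's precision are lost.
  return u * x + T (constant);
}

// Absolute error introduced purely by rounding `constant` to T.
template<typename T>
inline double
cast_error (double constant)
{
  return std::fabs (horner_step<T> (0.0, 0.0, constant) - constant);
}
```

For a constant such as 0.1, cast_error<float> (0.1) is about 1.5e-9 while cast_error<double> (0.1) is 0, and that per-coefficient error is carried through every later Horner step.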
Also, Wikipedia suggests that, similarly to exp2(), a log2 approximation can be split into a quick computation of the integer part and a tightly constrained approximation of the fractional part that only covers the interval [1,2): https://en.wikipedia.org/wiki/Binary_logarithm#Iterative_approximation
So I'd be interested to see the Remez approximation optimized for [1,2) only, with an error of 0 at 1 (this can be achieved by subtracting a constant). Combined with adding the integer part, the approximated function would then still match log2() exactly wherever log2() yields an integer, i.e. at powers of two.
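As a rough sketch of the "subtract a constant" idea: evaluate whatever polynomial was fitted on [1,2), subtract its value at 1, and add the integer part. The names horner, pinned_log2 and placeholder_coeffs are hypothetical, and the coefficients are a crude linear stand-in (m - 0.957), not lolremez output. It assumes x is finite and greater than 0.

```cpp
#include <cmath>
#include <cstddef>

// Placeholder coefficients, constant term first: p(m) = -0.957 + 1.0 * m.
// Purely illustrative; p(1) is deliberately not 0.
static const double placeholder_coeffs[2] = { -0.957, 1.0 };

// Horner evaluation of c[0] + c[1] * m + ... + c[N-1] * m^(N-1).
template<std::size_t N>
inline double
horner (const double (&c)[N], double m)
{
  double u = c[N - 1];
  for (std::size_t k = N - 1; k > 0; --k)
    u = u * m + c[k - 1];
  return u;
}

// Hypothetical sketch: integer part from the exponent, fractional part
// from p(m) - p(1), so the error at m == 1 is 0 by construction.
template<std::size_t N>
inline double
pinned_log2 (const double (&c)[N], double x)
{
  int e;
  const double m = 2 * std::frexp (x, &e); // m in [1, 2)
  const double offset = horner (c, 1.0);   // the constant to subtract
  return (e - 1) + (horner (c, m) - offset);
}
```

At powers of two the mantissa is exactly 1, the fractional term cancels to 0, and e.g. pinned_log2 (placeholder_coeffs, 8.0) returns exactly 3. In a real implementation the offset would simply be folded into the constant coefficient; ideally the fit itself would be constrained this way so the maximum error stays minimal.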