Aw: Re: Re: Re: deco-Math project, step 00_a: exact bin and dec 'ranges' (in gnumeric).




hello @John Denker,

> There are issues with floating point.

agree,
 
> AFAICT the real topic here is not a gnumeric issue,

let's concentrate on what gnumeric can! do:

- 'rounddown' and 'roundup' sometimes producing wrong results.

AFAICT neither IEEE nor 'C' or 'C++' lets you round to 'places'; they only offer 'round', 'floor' and 'ceil', which round to integers. thus rounding (up, down or plain 'round') to a specified number of digits right or left of the decimal point has to be coded in the application program that uses IEEE or 'C'. it is an algorithm implemented in the application. if the results are meaningful: ok. if not: it is an issue of the program, in this case of gnumeric.

IMHO gnumeric uses "rounddown(x,n) -> gnumeric_trunc(x,n) -> gnm_fake_trunc(x*10^n) -> go_fake_trunc(x*10^n) -> floor(go_d2d(x*10^n)) | floor(go_add_epsilon(go_d2d(x*10^n)))" to calculate a rounddown, and - from a human or mathematical POV - it fails for e.g. '=rounddown(0,24999999999999997;16)', '=rounddown(562949953421312,9)' and some other cases.
this can have several reasons:
- intentional, 'Excel compatibility'? -> then it would make sense if:
a.: this 'intentional error' were somehow communicated to the user (instead of letting her/him either calculate nonsense or have to ask stupid questions in forums), and
b.: gnumeric offered an alternative that calculates correct results, and
c.: gnumeric copied Excel's behaviour completely, rounding the input to 15 digits as well,
[be aware that Excel performs 'input rounding' to 15 significant decimal digits, thus in Excel '=rounddown(0,24999999999999997;16)' would change to '=rounddown(0,250000000000000;16)'; with that, a result of 0,25000~ for rounddown is consistent with the input. gnumeric calculates the same result but keeps the precise input value, which shows something inconsistent.]
- unintentional, error in the algorithm? -> then it should be improved,
- unintentional, errors in other functions (floor, int or similar)? -> then these should be found and improved,
- other ... which ones?
with my modest knowledge i can hardly do more than:
- point out the irritations - that's done with this,
- check whether other methods achieve better results - yes, they can: for values above 2^49 '=x-mod(x;1)' gives better results than floor(x).
the decision whether and what to change in gnumeric, and putting it properly into code, i have to leave to people with more experience.

the same applies to calculating which binary or decimal range a value is in.
it is sometimes important / necessary to know this, and usually '=int(log2(abs(x)))' or similar is recommended. However, this fails for a few values that are one or a few ULP below an exact power of two.
In my opinion it is useful / important that an algorithm delivers correct results for all! input values, therefore I have:
- pointed out the problem with examples,
- shown with a code example that more correct results are possible (this is not extensively tested, just a 'proof of concept'),
a suitable implementation i have to leave to people with more experience,

i would like to have 'bin_range' and 'dec_range' available to try out and to suggest further improvements; the programming would become clearer. and i would like to learn how to implement changes in gnumeric in a 'suitable' way.

(if you would rather leave me to my crude attempts, a hint on how to extract the exponent from an fp number would be helpful).
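Regarding the exponent question, two ways that standard C offers, shown as a hedged sketch. The function names are made up; the second variant assumes that `double` is IEEE 754 binary64 and that the value is normal (not 0, subnormal, infinity or NaN).

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* frexp() splits x into m * 2^e with 0.5 <= |m| < 1, so the exponent
 * of the normalized IEEE form (1 <= |m'| < 2) is e - 1.
 * Valid for finite, nonzero x. */
int exponent_frexp(double x)
{
    int e;
    (void)frexp(x, &e);
    return e - 1;
}

/* Reads the 11-bit biased exponent field straight from the bit
 * pattern and removes the bias of 1023. Assumes binary64, normal x. */
int exponent_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to view the bits */
    return (int)((bits >> 52) & 0x7FF) - 1023;
}
```

Both return 49 for 562949953421312 (which is 2^49) and -4 for 0.1.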

> The IEEE floating point standard was very carefully
> designed. Any attempt to do better would require many years of
> skilled, highly specialized effort.

i am not! talking about improving IEEE (maybe later), and not about 'posits' (cruel idea). i'm talking about what gnumeric calculates using IEEE functionality, and IMHO that fits into this forum.
 
> there is uncertainty in the raw data ...

that's expanding the issue(s) far beyond my questions. i'm talking about two small - almost atomic - failures which i consider to have the evil effect of spreading imprecision, not about handling erroneous input data streams.

> 'Monte Carlo plugin'?

maybe later; for the moment i consider it overkill to fight simple, clear miscalculations with randomness.

> One thing that does *not* work is worrying about rounding modes.

sorry for objecting, it does! matter. didn't Prof. Kahan already say something like 'accepting rounding errors puts you in a state of sin', or similar? and he could / should have added: 'rounding is a weak weapon against fp-imprecision, but it's the best thing we have, it's the only one'.

> Another thing that generally does not work is "interval arithmetic".
> That is, you could represent each number by an ordered pair, namely
> a strict lower bound and a strict upper bound. It's not clear, but
> I suspect that's what "ranges" are trying to do.

no, that's completely misunderstood. interval arithmetic might work to some extent, providing results with bounded errors, but that's absolutely not the 'ranges' i'm talking about. my target is a clear decision about which binary or decimal 'exponent range' a value is in, e.g. "[1 .. 10[" ('10[' meaning 'up to 10, but 10 excluded') is decimal range '0' (the exponent in scientific notation is '0'), and [4 .. 8[ is binary range '2' (the absolute value is >= 2^2 and below 2^3, the exponent in normalized IEEE notation is '2').
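The decimal half of this definition can be sketched in C without any logarithm, at least for a limited interval. The sketch below is deliberately restricted to 1 <= |x| < 10^22, because the powers of ten from 10^0 to 10^22 are exactly representable as doubles, so every comparison is exact. The name `dec_range` and the return value -1 for unsupported input are assumptions of this example; smaller and larger magnitudes would need a different technique.

```c
#include <math.h>

/* 10^0 .. 10^22: all exactly representable in binary64. */
static const double P10[23] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
    1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

/* Hypothetical 'dec_range', sketch for 1 <= |x| < 1e22 only.
 * Returns e with 10^e <= |x| < 10^(e+1), or -1 if |x| is outside
 * the supported interval (this also covers NaN). */
int dec_range(double x)
{
    double ax = fabs(x);

    if (!(ax >= P10[0] && ax < P10[22]))
        return -1;
    for (int e = 0; e < 22; e++)
        if (ax < P10[e + 1])
            return e;
    return -1;   /* not reached */
}
```

Here `dec_range(nextafter(10.0, 0.0))` is 0 and `dec_range(10.0)` is 1, i.e. the boundary of "[1 .. 10[" is decided by an exact comparison.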

regards, tia for any further help ...



b.

---
 
Sent: Tuesday, 06 July 2021 at 04:39
From: "John Denker via gnumeric-list" <gnumeric-list gnome org>
To: gnumeric-list gnome org
Subject: Re: Aw: Re: Re: deco-Math project, step 00_a: exact bin and dec 'ranges' (in gnumeric).
This thread is very confused.

For starters, AFAICT the real topic here is not a gnumeric issue,
but rather a floating-point issue, and perhaps an algorithm issue.

There are issues with floating point. Always have been. Always
will be. The IEEE floating point standard was very carefully
designed. Any attempt to do better would require many years of
skilled, highly specialized effort. This is not the proper forum
for that.

There is considerable accumulated expertise in dealing with the
roundoff errors inherent in any floating point representation.
This is an algorithm design issue. In virtually all applications,
floating point roundoff is not the only source of uncertainty.
In nearly all real-world applications, including science and
engineering, there is uncertainty in the raw data. Also, if you
are doing any sort of modeling, there are imperfections in the
model. For example, if the model involves a power series, there
will be series truncation errors. Floating point imperfections
are part (but only part) of the mix. There are fat books on how
to deal with this.

////////////

One particularly powerful method is Monte Carlo. That has been
around since the 1940s, which is rather a long time in the
computer business.

Simple Monte Carlo calculations can already be done using
spreadsheets. I've done thousands of them.

Commercial vendors sell plugins that facilitate complicated
Monte Carlo calculations for excel. If you want to do something
useful, you could write a similar plugin for gnumeric.

////////////

Another method is to just use integers. For example, in a financial
calculation, represent everything as an integral number of cents.
At the very last step, use integer_divide and integer_modulo to
format the result in dollars in the conventional way.

In particular, use the FPU. That is, store the integers in what
C calls a "double" ... which is what gnumeric already does. That
can represent integers exactly, over a rather wide range.

If you want, you can enable all the FPU exceptions, including
FE_INEXACT, to give you confidence that nothing bad is happening
behind your back. For example, any attempt to represent 0.1 will
throw the exception.

Beware that some library functions don't behave as expected. For
example, on my machine, sqrt(2.) does not throw the FE_INEXACT
exception.

/////////////

One thing that does *not* work is worrying about rounding modes.
That is strictly amateur hour. If the difference between
rounding_up / rounding_down / rounding_to_even is significant,
the battle is already over and you lost. Start over with a more
robust algorithm.

//////////////

Another thing that generally does not work is "interval arithmetic".
That is, you could represent each number by an ordered pair, namely
a strict lower bound and a strict upper bound. It's not clear, but
I suspect that's what "ranges" are trying to do. The problem is that
these are worst-case bounds. For typical algorithms, the worst case
is verrrry much worse than the typical case, impractically so. There
may be special mathematical situations where interval arithmetic is
usable, but in the other 99.999% of the situations you're vastly
better off with Monte Carlo ... and even if you could use interval
arithmetic you'd be better off with integers.

«God made the integers;
all else is the work of man.»
— Kronecker

