Query regarding hypergeometric function



This is a query about what "should happen" when a function is given arguments that are in some sense out of range.

The hypergeometric distribution function computes the probability that x "successes" are observed in a sample of n trials from a finite population of N elements, where M of the N are "successes". We sample WITHOUT replacement. Example: 10 people in a village of 100 have math anxiety (MA). We sample 5 of the 100 and clearly don't check the same person twice ("without replacement"). The hypergeometric distribution gives the probabilities that we get 0, 1, 2, 3, 4 or 5 people with MA in our sample.
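For concreteness, the village example can be worked out with a few lines of Python. This is only an illustrative sketch: the helper name hypergeom_pmf is made up here, the argument order (x, n, M, N) simply mirrors the hypgeomdist calls in this message, and nothing is implied about how Gnumeric computes the value internally.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(exactly x successes in a sample of n drawn without replacement
    from a population of N that contains M successes).
    Hypothetical helper; assumes integer arguments describing a feasible sample."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Village example: N = 100 people, M = 10 with math anxiety, sample of n = 5.
probs = [hypergeom_pmf(x, 5, 10, 100) for x in range(6)]
for x, p in enumerate(probs):
    print(x, p)
```

The six probabilities sum to 1, as they should for a complete set of outcomes.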

Now suppose nobody in the village has math anxiety. Clearly P(0, 5, 0, 100) = 1 and the rest should be 0. Gnumeric reports for "=hypgeomdist(1,5,0,100)" the result

#NUM!

which has some merit, since we are doing something that is "impossible" (getting 1 out of none).

I can see reporting an error for

=hypgeomdist(1,15,0,10)

i.e., a sample bigger than the population. But for values that are merely "impossible" within a feasible sample, I know I'd rather get 0.
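One way to express that preference is sketched below in Python. This is a proposal for discussion, not a description of what Gnumeric does: the function name hypgeomdist_sketch is hypothetical, integer arguments are assumed, and raising ValueError stands in for returning #NUM!.

```python
from math import comb

def hypgeomdist_sketch(x, n, M, N):
    """Hypothetical policy: error for an infeasible set-up,
    0 for an impossible count within a feasible sample."""
    # Infeasible set-up (negative sizes, sample or success count larger
    # than the population): the analogue of #NUM!.
    if min(n, M, N) < 0 or n > N or M > N:
        raise ValueError("parameters do not describe a feasible sample")
    # Feasible sample: x can only lie between these bounds.
    lowest = max(0, n - (N - M))
    highest = min(n, M)
    if x < lowest or x > highest:
        return 0.0  # "impossible" outcome, so probability 0 rather than an error
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

print(hypgeomdist_sketch(0, 5, 0, 100))  # the only possible outcome when M = 0
print(hypgeomdist_sketch(1, 5, 0, 100))  # impossible count, feasible sample
```

Under this policy a call such as hypgeomdist_sketch(1, 15, 0, 10) still fails, because the sample is bigger than the population.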

There are several of these borderline cases throughout the functions. It would be nice to document them and then come up with a good set of choices.

Comments welcome.

JN



