Re: group data statistical functions



Having just retired after teaching such things for over 3 decades, I must admit to feeling very tired from trying to get the "formulas" purged from textbooks.

The problem is that grouped data can yield descriptive statistics (and order statistics too) that are rather poor approximations to the actual values from the raw data. Thus it is quite important that the user really does decide which approximation should be used. This is even more the case when the bins are not of equal width, and published statistics are VERY bad in this respect.
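To make the point concrete, here is a minimal Python sketch comparing a mean computed from raw values with the usual midpoint approximation from bin counts. The data, the bin edges, and the helper name grouped_mean are all invented for illustration; nothing here comes from the original question.

```python
from statistics import mean

# Made-up raw data and made-up, unequal-width bins [1,2), [2,4), [4,10).
raw = [1.1, 1.2, 1.3, 1.9, 2.4, 3.0, 3.1, 7.5, 9.8]
edges = [1, 2, 4, 10]

def grouped_mean(values, edges):
    """Approximate the mean by weighting each bin midpoint by its count.

    Assumes half-open bins [lo, hi); that choice is itself one of the
    decisions the user has to make.
    """
    total = 0.0
    n = 0
    for lo, hi in zip(edges, edges[1:]):
        count = sum(1 for v in values if lo <= v < hi)
        total += count * (lo + hi) / 2
        n += count
    return total / n

raw_mean = mean(raw)
approx_mean = grouped_mean(raw, edges)
print("raw mean:    ", raw_mean)
print("grouped mean:", approx_mean)
```

With these particular numbers the two means differ by about a quarter of a unit, simply because the values within each bin do not sit symmetrically around the midpoint.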

The data presented are integers, so the grouped and raw data should produce the same results, but we don't know a priori whether data supposedly at "2" come from integers or from numbers anywhere in (1.5, 2.5] or [1.5, 2.5) or even [1, 2).
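A small sketch of why the interpretation matters, assuming a hypothetical frequency table (the counts and the helper midpoint_mean are made up). Each convention picks a different representative point for the label k:

```python
# Hypothetical frequency table: label -> count.
counts = {1: 3, 2: 5, 3: 2}

def midpoint_mean(counts, representative):
    """Mean of the table when label k is represented by representative(k)."""
    n = sum(counts.values())
    return sum(representative(k) * c for k, c in counts.items()) / n

# Label k is an exact integer value.
mean_exact = midpoint_mean(counts, lambda k: float(k))

# Label k stands for (k-0.5, k+0.5] or [k-0.5, k+0.5): the midpoint is still k.
mean_rounded = midpoint_mean(counts, lambda k: ((k - 0.5) + (k + 0.5)) / 2)

# Label k stands for [k-1, k): the midpoint is k - 0.5.
mean_floor = midpoint_mean(counts, lambda k: ((k - 1) + k) / 2)

print(mean_exact, mean_rounded, mean_floor)
```

The two rounding conventions agree on the midpoint mean (though they can still disagree on order statistics when values fall on a bin edge), while the [k-1, k) reading shifts the mean down by half a unit.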

A grouped descriptive statistic almost needs a page of documentation per number. The exam results on any question I gave to otherwise smart students on this topic were always <50%. It's not rocket science. Once you realize what is going on, you can quickly figure out whether it is worth doing.

Sorry if this seems a bit negative, but the dangers are rather like providing working flying controls for the kid in the back of the 777.

Offline I can provide ways to do it pretty quickly.

JN





