Re: Median: Oasis and Fast Sorting Algorithm
- From: "Andreas J. Guelzow" <aguelzow taliesin ca>
- To: gnumeric-list gnome org
- Subject: Re: Median: Oasis and Fast Sorting Algorithm
- Date: Sat, 10 Feb 2007 10:15:57 -0700
On Sat, 2007-10-02 at 02:41 +0200, Leonard Mada wrote:
> John Machin wrote:
> > ...
> > So who cares? The median value is 1. Is your alternative going to
> > return some value other than 1 ????
>
> Please define mathematically the middle value! It is NOT trivial as my
> definitions showed. Anything else would be ambiguous. This should be a
> standard, so make a better definition.
Contrary to your claims, there is nothing ambiguous. Any non-decreasing
ordering of the same values has the same middle value(s).
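A minimal sketch of this point, in Python. The helper name middle_values and
the sample data are illustrative assumptions, not taken from the OpenFormula
text. Every arrangement of the same values sorts to the same non-decreasing
list, so the middle value(s) cannot depend on the input order:

```python
from itertools import permutations


def middle_values(values):
    """Return the middle value (odd count) or the two middle values
    (even count) of the non-decreasing ordering of `values`, as a tuple."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("middle value of an empty list is undefined")
    if n % 2 == 1:
        return (ordered[n // 2],)
    return (ordered[n // 2 - 1], ordered[n // 2])


# Hypothetical sample data, chosen to contain repeated values.
odd_data = [1, 1, 1, 2, 3]
even_data = [1, 2, 2, 3, 4, 5]

# Collect the result for every possible input order; each set ends up
# with exactly one element because the order does not matter.
odd_results = {middle_values(p) for p in permutations(odd_data)}
even_results = {middle_values(p) for p in permutations(even_data)}
```

Repeated values do not change this: the sorted list is still unique, even
though several permutations produce it.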
>
> Well, I could have used a much shorter definition: the median is the
> value that halves the list so that there are two sets of equal size with
> numbers in the first set being higher than the median and numbers in the
> second set being lower. As noted, this definition avoids the sorting,
> too. (One could extend this definition for even and odd number of
> elements. Or even a much shorter definition: the 50th percentile. BUT
> all these definitions are ambiguous, see later.)
>
> The one thing I do NOT agree with at all in the OASIS definition is
> that it includes the word "sorting". Sorting is definitely NOT
> necessary to calculate the median. You can take any array, even one that
> is NOT sorted, and determine the median without first sorting it. This
> is stated wrongly much too often in so many textbooks, BUT sorting is
> really not necessary.
The OpenFormula standard does not prescribe any method used to find the
value. It only prescribes what the value is.
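To illustrate the difference between prescribing a value and prescribing a
method, here is a hedged Python sketch with two implementations: one sorts,
the other uses a selection algorithm (quickselect) and never fully sorts the
input. The function names are hypothetical, and the sketch assumes the
conventional rule that an even count yields the mean of the two middle
values. Both return the same result:

```python
import random


def median_by_sorting(values):
    """Median via a full sort."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def select(values, k):
    """Return the k-th smallest element (0-based) using quickselect,
    which partitions around a pivot instead of sorting everything."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            items = lower
        elif k < len(lower) + equal_count:
            return pivot
        else:
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]


def median_by_selection(values):
    """Median via selection; no full sort is performed."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return select(values, mid)
    return (select(values, mid - 1) + select(values, mid)) / 2


# The unsorted list from this thread, 1,2,2,3,4,5 in shuffled order.
data = [4, 2, 5, 1, 3, 2]
by_sort = median_by_sorting(data)
by_select = median_by_selection(data)
```

Which of the two an implementation uses is a performance decision; a
definition phrased in terms of the ordered list is satisfied by either.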
>
> So, this is NOT a prerequisite that should enter a standard definition.
>
> May I even point out that, for an even number of elements, one may
> define/have an upper median and a lower median. Alternatively, in
> serious mathematical uses, the median is usually calculated using a
> weighted approach. Therefore, the median of 1,2,2,3,4,5 is NOT (2+3)/2 =
> 2.5, BUT rather (2+2+3)/3 = 2.33. So, it does make sense to have a very
> strong and unambiguous definition in a standard.
> The *weighted median* may be introduced later into the standard and then
> the ambiguity would be complete.
MEDIAN is not intended to implement a weighted median. None of the
current spreadsheet implementations uses that name for a weighted
median.
Gnumeric, for example, also provides a function for a weighted median,
namely SSMEDIAN. That function may at some time also be introduced in
the Standard, but it would in no way make the other definition ambiguous.
Andreas
--
Prof. Dr. Andreas J. Guelzow
Dept. of Mathematical & Computing Sciences
Concordia University College of Alberta