*From*: Leonard Mada <discoleo gmx net>
*To*: gnumeric-list gnome org
*Subject*: Re: Median: Oasis and Fast Sorting Algorithm
*Date*: Sat, 10 Feb 2007 19:35:03 +0200

Let's take the array 1,1,2,3,3,3

Sincerely Yours,

Leonard Mada

Andreas J. Guelzow wrote:

> On Sat, 2007-10-02 at 02:41 +0200, Leonard Mada wrote:
>
> > John Machin wrote:
> >
> > > ...
> > >
> > > So who cares? The median value is 1. Is your alternative going to return some value other than 1 ????
> >
> > Please define mathematically the middle value! It is NOT trivial as my definitions showed. Anything else would be ambiguous. This should be a standard, so make a better definition.
>
> Contrary to your claims, there is nothing ambiguous. Any non-decreasing list of the same values has the same middle value(s).
>
> > Well, I could have used a much shorter definition: the median is the value that halves the list so that there are two sets of equal size with numbers in the first set being higher than the median and numbers in the second set being lower. As noted, this definition avoids the sorting, too. (One could extend this definition for even and odd number of elements. Or even a much shorter definition: the 50th percentile. BUT all these definitions are ambiguous, see later.)
> >
> > The one thing that I do NOT agree with at all in the OASIS definition is that it includes the wording "sorting". Sorting is definitely NOT necessary to calculate the median. You can take any array, even one that is NOT sorted, and determine the median without first sorting it. This is much too often stated wrongly in so many textbooks, BUT sorting is really not necessary.
>
> The OpenFormula standard does not prescribe any method used to find the value. It only prescribes what the value is.
>
> > So, this is NOT a prerequisite that should enter a standard definition.
> >
> > May I even point out, that for even number of elements, one may define/have an upper median and a lower median. Alternatively, in serious mathematical uses, the median is usually calculated using a weighted approach. Therefore, the median of 1,2,2,3,4,5 is NOT (2+3)/2 = 2.5, BUT rather (2+2+3)/3 = 2.33.
> >
> > So, it does make sense to have a very strong and unambiguous definition in a standard.
> >
> > The *weighted median* may be introduced later into the standard and then the ambiguity would be complete.
>
> MEDIAN is not intended to implement a weighted median. None of the current spreadsheet implementations uses that name for a weighted median.
>
> Gnumeric for example does also provide a function for a weighted median, namely SSMEDIAN. That function may at some time also be introduced in the Standard but would in no way make other definitions ambiguous.
>
> Andreas
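
To make the two points in the exchange above concrete (sorting is not required to find the median, and the value does not depend on the method), here is a minimal Python sketch based on quickselect. It is only an illustration, not what Gnumeric or any other spreadsheet actually does; the names `select_kth` and `median_by_selection` are made up for this example. It assumes the usual MEDIAN convention for an even number of elements: the mean of the two middle values.

```python
import random
import statistics


def select_kth(values, k):
    """Return the k-th smallest element (0-based) without fully sorting."""
    items = list(values)  # work on a copy; the caller's list is untouched
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            # The wanted element is among the values below the pivot.
            items = lower
        elif k < len(lower) + len(equal):
            # The wanted element is the pivot itself (or a duplicate of it).
            return pivot
        else:
            # Skip everything <= pivot and keep searching above it.
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


def median_by_selection(values):
    """Median via selection: middle value, or mean of the two middle values."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    upper = select_kth(values, n // 2)       # the "upper median"
    if n % 2 == 1:
        return upper
    lower = select_kth(values, n // 2 - 1)   # the "lower median"
    return (lower + upper) / 2


data = [3, 1, 3, 2, 1, 3]  # the array 1,1,2,3,3,3 in scrambled order
print(median_by_selection(data))  # 2.5
print(statistics.median(data))    # 2.5, the sort-based result agrees
```

The selection approach runs in expected linear time, but it returns exactly the value that the sort-based definition describes, so the wording of the definition and the choice of algorithm stay independent.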

