*From*: "Uri David Akavia" <uridavid akavia gmail com>*To*: "Leonard Mada" <discoleo gmx net>*Cc*: dev sc openoffice org, gnumeric-list gnome org*Subject*: Re: Median: Oasis and Fast Sorting Algorithm*Date*: Sat, 10 Feb 2007 11:48:41 +0200

Admittedly, my complexity course is behind me, but there seem to be a few things wrong with your calculations of the efficiency. I'll try to point out what I didn't like. If I'm wrong, please correct me. On 2/10/07, Leonard Mada <discoleo gmx net> wrote:

I have imagined a different algorithm, too, although I never tested it. I will briefly describe the case with 2*n elements (2n+1 should be similar):

First, if you're using 2*n elements, shouldn't your results be nlog(n) instead of n/2*log(n)? Second, if you're going to use O notation, O already indicates the worst scenario, and basically O(n/2 * log(n) -1 + log[(n/2 -1)!]) can be reduced to O(nlog(n)), which is the same O as ordinary sort. It might have better Omega or Theta, which I think is what you meant. Third, you seem to have forgotten some actions when calculating - I realize that you're allowed to ignore some operations, but I'm not sure about these (again, please correct me if I'm wrong) - there are two comparisons for every item in the second half of the array, therefore you need to add 2*n actions. In step 7, you mention that adding an additional value should take O(log(n)). I believe that figuring out where to place this value would take O(log(n)). If in addition you need to actually place this value in the array, you need to shift n/2 items in the worst case scenario. This means that you have a decreasing series, going from n/2 to 1 - the sum of this series is (n/2 + 1)*n/4 or O(n^2). According to my calculations, this means that figuring the median of 2*n items would take n*log(n) + log(n!) + (n/2 + 1)*n/4, or O(n^2). This would seem to be slower than your calculation, and has a larger O notation then ordinary sort. In fact, it would seem to be effectively slower then 2n*log(2n). As I said, I might be mistaken in my calculations, since it has been a while since I actually learned how to do them. If I am, please correct me. However, O notation is not a practical representation, since it assumes that all actions are equal. I'm not familiar enough with what actions are costlier in C, and you might have to take into account Gnumeric's internal representation. As one coding book mentioned "Pillage, then burn" - first measure actions and figure out what to optimize, then do it. You'd have to benchmark the current C code, and your proposed C code to figure out the difference. So that's one point. While I'm not sure about revising the algorithm, you might be right about the definition - define it as "MEDIAN will return the value that WOULD be in the middle IF sorted" and then you won't force anyone to sort. I still think most people will, since median discovery by sorting is fast enough for most practical purposes, and it is much simpler (hence, but free) then more complicated code. Yours, Uri David Akavia

**Follow-Ups**:**Re: Median: Oasis and Fast Sorting Algorithm***From:*Leonard Mada

**Re: Median: Oasis and Fast Sorting Algorithm***From:*Andreas J. Guelzow

**Re: [sc-dev] Re: Median: Oasis and Fast Sorting Algorithm***From:*Niklas Nebel

**References**:**Median: Oasis and Fast Sorting Algorithm***From:*Leonard Mada

[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]