Re: Make'em suffer!



On Wed, 16 Aug 2006 01:25:40 +0200, "Manu Cornet" wrote:
> * We were missing an important bit of information, the number of iterations.
>
> So I changed the output to look like this:
>
> # 3 x 84 iterations -- Full torture
> # Widget        Boot-create     Boot-map        Boot-expose

That's not quite what I'm getting. I get 0 for the number of
iterations (it looks like n_iter is being printed before it is set in
torture_widget).

I do find the "3 x" somewhat confusing. Is that just a reference to
the three separate runs (what I named "Boot", "Expose" and "Resize")?
It might give the impression that reported times should be divided by
252 (3 x 84) instead of 84.

Also, the program currently has an arbitrary multiplier of 12 between
the input "torture level" and the number of iterations performed. This
is a barrier for someone trying to replicate an experiment after
reading a report. It would be much easier if the multiplier were
dropped and the user just supplied the number of iterations directly.
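
To make the difference concrete, here is a minimal sketch. It is in
Python rather than the tool's own language, and the function names are
hypothetical, not taken from the program; only the multiplier of 12
comes from the current behaviour described above.

```python
# Hypothetical illustration; only the constant 12 comes from the tool.
LEVEL_MULTIPLIER = 12


def iterations_from_level(torture_level: int) -> int:
    """Current behaviour as described: iterations = torture level * 12."""
    return torture_level * LEVEL_MULTIPLIER


def iterations_direct(n_iter: int) -> int:
    """Proposed behaviour: the user supplies the iteration count itself."""
    if n_iter <= 0:
        raise ValueError("number of iterations must be positive")
    return n_iter
```

With the multiplier in place, a reader who sees "84 iterations" in a
report has to know to divide by 12 before re-running the test. With
the direct form, the number in the report is the number to type.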

> If your postprocessing script is set to ignore #-commented lines, this
> should still work fine. I've also added a blank line before each
> output (each single widget test, or each full torture) so that it is
> easier to read, I think this shouldn't be a problem?

Heh. My postprocessing script definitely got confused by the blank
line. But no matter... it shouldn't be hard for me to fix the next
time I touch this.
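
For what it's worth, the fix is probably along these lines. This is a
sketch only, not my actual script: it assumes each data line is a
widget name followed by whitespace-separated numeric columns, and the
name parse_results is made up for the example.

```python
def parse_results(text):
    """Return (widget, [times...]) tuples from the tool's text output.

    Assumed format: one widget name, then numeric columns, separated
    by whitespace. Blank lines and #-commented lines are ignored.
    """
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip the blank separator lines and the "#" header lines.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        rows.append((fields[0], [float(value) for value in fields[1:]]))
    return rows
```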

> And what do you think about displaying the average time values instead
> of the total time values, so that one could directly compare results
> obtained with different numbers of iterations?

That sounds quite reasonable.
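
A minimal sketch of that normalization, in Python for brevity and with
a hypothetical function name. It assumes each reported total covers
n_iter iterations of a single phase, so the divisor is the
per-phase count (84 in the header above), not 3 times that.

```python
def average_times(total_times, n_iter):
    """Convert per-phase total times into per-iteration averages."""
    if n_iter <= 0:
        # Also guards against the "0 iterations" header problem.
        raise ValueError("n_iter must be positive")
    return [total / n_iter for total in total_times]
```

Two runs with different iteration counts then report directly
comparable numbers, as long as both use the same time unit.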

> Indeed, very large buttons or very small scrollable areas aren't
> really interesting nor common (although I don't think the size has a
> very big impact on the boot time), but I think this tool will be used
> mainly to compare different engines. And I would tend to say that if
> two engines have exactly the same speed in common situations, but one
> of them is written well enough to handle more general situations
> better and faster, well, it seems fair to give it better "marks" :-)
> Does this make sense?

The important question comes down to this: can someone use the results
of this tool to accurately predict the performance of one theme
compared to another (or, in my case, one version of GTK compared to
another) when applied to real applications rather than the test cases?

And the answer to that depends on how accurately the test cases match
what the real applications do.

I'm not claiming that the results are invalid. Those extreme widget
sizes just seemed like something that a real application is likely
never to use.

-Carl



