On Wed, 16 Aug 2006 01:25:40 +0200, "Manu Cornet" wrote:
> * We were missing an important bit of information, the number of
>   iterations.
>
> So I changed the output to look like this:
>
> # 3 x 84 iterations -- Full torture
> # Widget        Boot-create     Boot-map        Boot-expose

That's not quite what I'm getting. I get 0 for the number of iterations
(it looks like n_iter is being printed before it is set in
torture_widget).

I do find the "3 x" somewhat confusing. Is that just a reference to the
three separate runs (what I named "Boot", "Expose", and "Resize")? It
might give the impression that the reported times should be divided by
252 instead of 84.

Also, the program currently applies an arbitrary multiplier of 12
between the input "torture level" and the number of iterations
performed. That's a barrier for someone trying to replicate an
experiment after reading a report. It would be much easier if the
multiplier were dropped and the user just supplied the number of
iterations directly.

> If your postprocessing script is set to ignore #-commented lines, this
> should still work fine. I've also added a blank line before each
> output (each single widget test, or each full torture) so that it is
> easier to read. I think this shouldn't be a problem?

Heh. My postprocessing script definitely got confused by the blank
line. But no matter... it shouldn't be hard for me to fix the next time
I touch it.

> And what do you think about displaying the average time values instead
> of the total time values, so that one could directly compare results
> obtained with different numbers of iterations?

That sounds quite reasonable.

> Indeed, very large buttons or very small scrollable areas aren't
> really interesting nor common (although I don't think the size has a
> very big impact on the boot time), but I think this tool will be used
> mainly to compare different engines. And I would tend to say that if
> two engines have exactly the same speed in common situations, but one
> of them is written well enough to handle more general situations
> better and faster, well, it seems fair to give it better "marks" :-)
> Does this make sense?

The important question comes down to this: can someone use the results
of this tool to accurately predict the performance of one theme
compared to another (or, in my case, one version of GTK compared to
another) when applied to real applications rather than to the test
cases? And the answer to that depends on how closely the test cases
match what real applications actually do.

I'm not making a direct claim that the results are invalid. It just
seemed like something that a real application is likely never to do.

-Carl
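
P.S. To make the iteration-count suggestion concrete, here's a minimal
sketch of the interface and output I have in mind. This is not the
actual test program; run_boot and the widget name are placeholders I
made up. The point is just the ordering and the arithmetic: take the
count directly from the command line (no hidden multiplier of 12), set
n_iter before printing the header so it can't come out as 0, and report
the average per iteration rather than the total.

/* Sketch only: illustrates reading the iteration count directly,
 * printing the header after n_iter is known, and reporting averages.
 * run_boot() is a hypothetical stand-in for one timed widget pass. */
#include <stdio.h>
#include <stdlib.h>

static double
run_boot (void)
{
    /* Placeholder for the timed create/map/expose work. */
    return 1.0;
}

int
main (int argc, char **argv)
{
    int n_iter, i;
    double total = 0.0;

    if (argc < 2) {
        fprintf (stderr, "Usage: %s N_ITERATIONS\n", argv[0]);
        return 1;
    }

    /* The user supplies the iteration count directly. */
    n_iter = atoi (argv[1]);
    if (n_iter <= 0) {
        fprintf (stderr, "N_ITERATIONS must be a positive integer\n");
        return 1;
    }

    /* Header printed only after n_iter is set, so it is never 0. */
    printf ("# %d iterations -- Full torture\n", n_iter);
    printf ("# Widget\tBoot-create\n");

    for (i = 0; i < n_iter; i++)
        total += run_boot ();

    /* Average per iteration, comparable across different n_iter values. */
    printf ("GtkButton\t%g\n", total / n_iter);

    return 0;
}

With something shaped like that, the header always shows the real count
and the numbers stay comparable between runs with different numbers of
iterations.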