Re: reftests



As an update: http://blogs.gnome.org/otte/2011/05/05/reftests/ has a
tutorial for writing reftests. I put it on my blog because it's nicer to
lay things out there, I didn't want to send large GIF attachments via
email, and it's reasonably easy for me to point anyone to it in the
future if I want them to write a reftest.

Benjamin


On Tue, May 3, 2011 at 10:01 PM, Benjamin Otte <otte gnome org> wrote:
> Hey,
>
> with the latest commits[1] I have added reftests to GTK. Reftests are
> my approach to getting the layout and rendering behavior of GTK tested.
> I've added a bunch of tests already for the things I have fixed and
> will continue to add tests for bugs I fix. For what the test runner
> does, see the commit message in [1]; for what reftests are, see [2].
> The test runner works very well, even though it is still a bit rough
> around the edges, but that's mostly because gtester needs to be made
> better to cope with generic testing. (It's way too crash-happy as-is.)
>
> In this mail, I want to go into the motivation for writing reftests
> and why I didn't want to make use of the previous test infrastructure.
> I tried to achieve the following goals (if you think I could achieve
> them better, please speak up):
> - It should be easy to create tests
> - It should be easy to run tests
> - It should be easy to understand tests
> - It should be easy to fix problems shown by tests
> - The test infrastructure should easily scale
>
> That's the TL;DR version, here is the long one:
>
> - It should be easy to create tests
> Writing a test is something people hate to do. It's the #1 reason why
> Open Source projects don't write tests. Also, it's the #1 reason why
> bugs aren't fixed. If people filed bugs with easy-to-reproduce
> tests instead of saying "in my custom application, when I do X, Y
> happens and not Z", there'd be a much higher chance developers would
> be interested in looking at it.
> This is why the reftests use stock ui files that can be created in
> Glade. So everyone who is able to use Glade can create a test file.
> And we can just use it.
>
> - It should be easy to run tests
> It's quite hard to get someone to run a test. It requires compilation
> of a GTK checkout. That is not good.
> For a developer, too, it's quite complicated to run a test from
> someone else, say from bugzilla or a pastebin. Either you have to
> invoke gcc manually or you have to integrate it into the testsuite
> infrastructure.
> With reftests, you dump the ui file somewhere and run
> tests/reftests/gtk-reftest path/to/file.ui and that's it. You can then
> spend the rest of the day updating the testcase wherever you want, and
> pastebin or mail it back and forth with whoever you are working on the
> test with.
>
> - It should be easy to understand tests
> Here's an example output from the current testsuite:
>  /FilterModel/filled/hide-root-level:
>  ** ERROR **: Signal queue empty
>  aborting...
> It's hard to understand what might be broken. The output from current
> tests is both sparse and not very informative. If somebody came into
> IRC and said he ran make check and got this, I doubt anybody would
> know how to fix it. Or be interested in actually fixing what is wrong.
> So it is important that tests provide output that is easy to digest
> and gives a hunch of what is actually wrong. Which is why gtk-reftest
> outputs images - the reference rendering of the expected output[3],
> the actual rendering[4] and the difference between those[5]. And it
> should be reasonably easy to find the difference between them and get
> an idea of what is wrong (Pango doesn't ellipsize every row, only the
> last one. Bad Pango - and Behdad hasn't even applied my patch for
> this, I need to poke him again as I've just committed that test,
> ooops.)
>
> - It should be easy to fix problems shown by tests
> This is really a combination of the previous points, but deserves
> separate mention: If a test regresses in a year or so and the original
> author has left to work on LibreOffice, Mozilla or other exciting
> jobs, it should be easy for the current developer to fix the problem.
>
> - The test infrastructure should easily scale
> This is mostly a question about how to organize a test suite so that
> people actually run it. Or at least run the parts that are relevant to
> them and an automatic testing infrastructure can do the full run and
> actually produce useful output for developers if something fails. So
> far we're pretty bad at this. Our patented test runner named Dan
> Winship interacts with the developers by reopening bugs with a bit of
> output from stderr. That works for now, but I'm not sure that test
> runner wants to scale.
> To give everyone a clue for what I'm aiming at:
> * The Swfdec testsuite contains 2.500+ tests. It takes 3 minutes to run.
> * The cairo testsuite contains 350 tests. It takes about 10 minutes to
> run for a normal run. A full run easily takes an hour.
> * The WebKit testsuite contains 20.000+ tests. It takes 15-20 minutes
> to run them all.
> So from looking at those numbers (and I didn't include Mozilla because
> I couldn't find any numbers - but they would be frightening) I would
> guess that a "proper" GTK testsuite should contain 10.000+ tests and a
> full run would take at least 10 minutes. And in there, it should be
> easy to identify tests, run some of them and generate useful outputs.
> In particular, it should be easy to skip it.
>
> So, this got longer than I expected. I'd better close now. Questions?
>
> Benjamin
>
> PS: Credit for this test runner goes to David Baron, Robert
> O'Callahan, Carl Worth and Sandro Santilli, who inspired me to spend
> more time on testing and actually like it.
>
>
> 1: http://git.gnome.org/browse/gtk+/commit/?id=363dbb60397ebf683d8a97ae15517030c27357d7
> 2: http://weblogs.mozillazine.org/roc/archives/2008/12/reftests.html
> 3: http://people.freedesktop.org/~company/stuff/label-fun.ref.png
> 4: http://people.freedesktop.org/~company/stuff/label-fun.out.png
> 5: http://people.freedesktop.org/~company/stuff/label-fun.diff.png
>
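
For anyone wondering how the reference, output and diff images mentioned
above relate to each other, here is a rough sketch of the idea in Python.
It is not how gtk-reftest is implemented. The names diff_images and
reftest_passes are made up for illustration, and images are assumed to be
plain nested lists of RGB tuples rather than real image surfaces.

```python
from typing import List, Tuple

# Hypothetical in-memory image type: rows of (r, g, b) tuples.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]

RED: Pixel = (255, 0, 0)    # marks a pixel that differs
BLACK: Pixel = (0, 0, 0)    # marks a pixel that matches


def diff_images(reference: Image, actual: Image) -> Tuple[Image, int]:
    """Compare two same-sized images pixel by pixel.

    Returns the diff image and the number of differing pixels.
    """
    same_height = len(reference) == len(actual)
    same_widths = all(len(r) == len(a) for r, a in zip(reference, actual))
    if not (same_height and same_widths):
        raise ValueError("images differ in size")

    diff: Image = []
    mismatches = 0
    for ref_row, out_row in zip(reference, actual):
        diff_row = []
        for ref_px, out_px in zip(ref_row, out_row):
            if ref_px == out_px:
                diff_row.append(BLACK)
            else:
                diff_row.append(RED)
                mismatches += 1
        diff.append(diff_row)
    return diff, mismatches


def reftest_passes(reference: Image, actual: Image) -> bool:
    """A test passes only if the rendering matches the reference exactly."""
    _, mismatches = diff_images(reference, actual)
    return mismatches == 0
```

With this kind of comparison, a failing test leaves you with three
pictures to look at: what was expected, what was rendered, and where the
two disagree.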

