Re: Gtk+ unit tests (brainstorming)



On Wed, 2006-10-25 at 17:52 +0200, Tim Janik wrote:

> - Unit tests should run fast - a test taking 1/10th of a second is a slow
>    unit test, i've mentioned this in my blog entry already.

Sure, very important, or otherwise developers will tend to neither use
nor maintain the tests.

> - in the common case, test results should be reduced to a single boolean:
>      "all tests passed" vs. "at least one test failed"
>    many test frameworks provide means to count and report failing tests
>    (even automake's standard check:-rule), there's little to no merit to
>    this functionality though.
>    having/letting more than one test fail and to continue work in an
>    unrelated area rapidly leads to confusion about which tests are
>    supposed to work and which aren't, especially in multi-contributor setups.
>    figuring whether the right test passed, suddenly requires scanning of
>    the test logs and remembering the last count of tests that may validly
>    fail. this defeats the purpose of using a single quick make check run to
>    be confident that one's changes didn't introduce breakage.
>    as a result, the whole test harness should always either succeed or
>    be immediately fixed.

I understand your point; however, I still think that being able to get a
wider report listing all the tests failing at a given moment is also
interesting (for example, in a continuous integration loop with buildbot,
like the one being prepared by the build-brigade). Besides, if a group
of people wants to work on fixing bugs at the same time, they would need
the full list of failing tests, not only the first one.

> - for reasons also mentioned in the aforementioned blog entry it might
>    be a good idea for Gtk+ as well to split up tests into things that
>    can quickly be checked, thoroughly be checked but take long, and into
>    performance/benchmark tests.
>    these can be executed by make targets check, slowcheck and perf
>    respectively.

Yes, that seems like an excellent idea to me.

> - homogeneous or consistent test output might be desirable in some contexts.

Yes, this is an important point when thinking about a continuous
integration tool for GNOME. If the tests for all GNOME modules agree on
a common output format, then that data can be collected, processed and
presented by a continuous integration tool like buildbot, which would
make it easy to do interesting things with those test results. We had
also talked a bit about this subject in the build-brigade.

>    so far, i've made the experience that for simple make check runs, the most
>    important things are that it's fast enough for people to run frequently
>    and that it succeeds.
>    if somewhat slowly perceived parts are hard to avoid, a progress indicator
>    can help a lot to overcome the required waiting time. so, here the exact
>    output isn't too important as long as some progress is displayed.

Yes, good point. 

> - GLib based test programs should never produce a "CRITICAL **:" or
>    "WARNING **:" message and succeed. the reasoning here is that CRITICALs
>    and WARNINGs are indicators for an invalid program or library state,
>    anything can follow from this.
>    since tests are in place to verify correct implementation/operation, an
>    invalid program state should never be reached. as a consequence, all tests
>    should upon initialization make CRITICALs and WARNINGs fatal (as if
>    --g-fatal-warnings was given).

Maybe you would also like to test how the library handles invalid input,
though. For example, say we have a function that takes a pointer
parameter: I think it is worth knowing whether that function safely
handles the case where the pointer is NULL (if NULL is not an allowed
value for that parameter), or whether it produces a segmentation fault
in that case.

> as far as a "testing framework" is needed for GLib/Gtk+, i think it would
> be sufficient to have a pair of common testutils.[hc] files that provide:
> 
> 1- an initialization function that calls gtk_init() and preparses
>     arguments relevant for test programs. this should also make all WARNINGs
>     and CRITICALs fatal.
> 
> 2- a function to register all widget types provided by Gtk+, (useful for
>     automated testing).
> 
> 3- a function to fork off a test and assert it fails in the expected place
>     (around a certain statement).
> 
> 4- it may be helpful to have a fork-off and timeout helper function as well.
> 
> 5- simple helper macros to indicate test start/progress/assertions/end.
>     (we've at least found these useful to have in Beast.)
> 
> 6- output formatting functions to consistently present performance measurements
>     in a machine parsable manner.
> 
> 
> if i'm not mistaken, test frameworks like Check would only help us out with
> 3, 4 and to some extent 5. i don't think this warrants a new package
> dependency, especially since 5 might be highly customized and 3 or 4 could be
> useful to provide generally in GLib.
> 

I'll add here some points supporting Check ;):

As I said in another email, Check would not be a dependency for building
GTK+, only for running the tests inside GTK+.

Check is widely used, and having a standard tool for testing, instead of
something ad hoc, has its advantages too.

You never know when you will need another feature for your testing. If
you use a tool, maybe it already provides it, and if it does not, you
can always request it. But if you don't use a testing tool, you will
always need to implement it yourself.

Anyway, I agree that the most important thing here is the tests, not
the test framework. Whether you finally decide to go with Check or
without it, count on me to collaborate! :)

> also, i've spent some thoughts on the things that would be nice to have under
> automatic unit tests in Gtk+:
> 
> - check layout algorithms by laying out a child widget that does nothing but
>    checking the coordinates it's laid out at. i've played around with such
>    a test item in Rapicorn. as food for thought, here's a list of the
>    properties it currently supports (assertions are carried out upon exposure):
>      MakeProperty (TestItem, epsilon,       "Epsilon",       "Epsilon within which assertions must hold",  DFLTEPS,   0,         +MAXFLOAT, 0.01, "rw"),
>      MakeProperty (TestItem, assert_left,   "Assert-Left",   "Assert positioning of the left item edge",   -INFINITY, -INFINITY, +MAXFLOAT, 3, "rw"),
>      MakeProperty (TestItem, assert_right,  "Assert-Right",  "Assert positioning of the right item edge",  -INFINITY, -INFINITY, +MAXFLOAT, 3, "rw"),
>      MakeProperty (TestItem, assert_bottom, "Assert-Bottom", "Assert positioning of the bottom item edge", -INFINITY, -INFINITY, +MAXFLOAT, 3, "rw"),
>      MakeProperty (TestItem, assert_top,    "Assert-Top",    "Assert positioning of the top item edge",    -INFINITY, -INFINITY, +MAXFLOAT, 3, "rw"),
>      MakeProperty (TestItem, assert_width,  "Assert-Width",  "Assert amount of the item width",            -INFINITY, -INFINITY, +MAXFLOAT, 3, "rw"),
>      MakeProperty (TestItem, assert_height, "Assert-Height", "Assert amount of the item height",           -INFINITY, -INFINITY, +MAXFLOAT, 3, "rw"),
>      MakeProperty (TestItem, fatal_asserts, "Fatal-Asserts", "Handle assertion failures as fatal errors",  false, "rw"),

Cool, very interesting.

> - create all widgets with mnemonic constructors and check that their
>    activation works.
> - generically query all key bindings of stock Gtk+ widgets, and activate them,
>    checking that no warnings/criticals are generated.
> - create a test rcfile covering all rcfile mechanisms, that's parsed and whose
>    values are asserted in the resulting GtkStyles.
> - for all widget types, create and destroy them in a loop to:
>    a) measure basic object setup performance
>    b) catch obvious leaks
>    (these would be slowcheck/perf tests)
> 

You have done an interesting and complete job analyzing the testing
needs in GTK+, and you also give some hints on how you think the
testing should be done. Good analysis!

Here are some bits I would like to add to this brainstorming, most of
them come from my work on the tests I've already done:

- As Federico said, I think the tests should be split into independent
programs that test independent components. This way, developers making
changes in one widget would be able to run only the tests that deal
with the component they are modifying.

- I think unit tests for an interface should consider three cases:
   1. Tests of regular values: how does the interface behave when I use
it normally?
   2. Tests of limit values: how does the interface behave when I use it
with a limit value (for example, the first and last elements of an
array)?
   3. Tests of invalid values: how does the interface handle invalid
input? Does it handle it safely, or does it break completely?

- Documentation of unit tests is important. Each test case should state
what it is testing, in a homogeneous format. For example, these are the
headers I used in the tests I developed:

/**
 * gtk_button_new_with_label (regular)
 *  - Test 1: Create a button with a label
 *  - Test 2: Create a button with an empty label
 *  - Test 3: Create a button with a NULL label
 */

- The tests should be homogeneous; I mean, it would be nice if they all
looked the same. That would make them really easy to read and
understand, and it would also make it really easy to contribute new
tests. There could also be a template with the main parts of a widget
test file; that's what I did while developing my tests. (I attach that
template in case you would like to have a look. The <> marks mean
something that needs to be inserted there, usually the name of the
component you are testing, its type, etc.)

- There are some kinds of tests that deal with events. Of course, one
can develop some utilities to create and send those events to widgets,
for example to test key bindings, mouse clicks, etc., but maybe it
would be better to use a tool like Dogtail for that purpose.

- Related to the above point, it would be nice to provide use case
tests on top of the unit tests. I mean, not only testing interfaces,
but complete use cases for widgets. For example, one could test the
complete process of opening a file: open the file chooser dialog,
select a directory, select a file in that directory, click the Open
button, etc. Dogtail would work nicely here.

- Should we test signal emission? I think signals are part of the API
contract, so we should test whether a function emits a signal when it
is supposed to.

Iago.
#include <check.h>
#include <string.h>

#include "check-utils.h"
#include "gtkmain.h"
#include "<>"
#include "gtkhbox.h"


/* ---------------------------------------------------- */
/*                      GLOBALS                         */
/* ---------------------------------------------------- */

static <> *<> = NULL;

/* ---------------------------------------------------- */
/*                      FIXTURES                        */
/* ---------------------------------------------------- */

static void
fx_setup_default_gtk_<> (void)
{
  int argc = 0;

  gtk_init (&argc, NULL);

  <> = GTK_<> (<>_new ());
  
  /* Check whether <> object has been created properly */
  fail_if (!GTK_IS_<> (<>), 
           "<> creation failed.");
}

static void 
fx_teardown_default_gtk_<> (void)
{
  gtk_widget_destroy (GTK_WIDGET (<>));
}

/* ---------------------------------------------------- */
/*                     TEST CASES                       */
/* ---------------------------------------------------- */

/**
 * gtk_<> (regular)
 *   - Test 1:
 */
START_TEST (test_<>_regular)
{
  
}
END_TEST

/**
 * gtk_<> (invalid)
 *   - Test 1:
 */
START_TEST (test_<>_invalid)
{
  
}
END_TEST

/* ---------------------------------------------------- */
/*                  Suite creation                      */
/* ---------------------------------------------------- */

/**
 * Prepares the test suite 
 */
SRunner *
configure_tests ()
{
  SRunner *sr = NULL;
  Suite *s = NULL;

  /* Create the suite */
  s = suite_create ("Gtk<>");

  /* Create test cases */
  TCase *tc1 = tcase_create ("<>");

  /* Create unit tests for test case "<>" */
  tcase_add_checked_fixture (tc1, fx_setup_default_gtk_<>, fx_teardown_default_gtk_<>);
  tcase_add_test (tc1, test_<>_regular);
  tcase_add_test (tc1, test_<>_invalid);
  suite_add_tcase (s, tc1);

  /* Create srunner object with the test suite */
  sr = srunner_create (s);

  return sr;
}

