Re: RFC: GLib testing framework

From: Behdad Esfahbod <behdad behdad org>
To: Tim Janik <timj imendio com>
Cc: Gtk+ Developers <gtk-devel-list gnome org>
Subject: Re: RFC: GLib testing framework
Date: Thu, 01 Nov 2007 14:20:01 -0400

Hi Tim,

Thanks for the nice writeup.

In cairo land we've picked up some new requirements for our test suite
over time.  I'll try to summarize our current infrastructure and
requirements.  Most of these were mentioned in the thread last year, but
I repeat them so you can double-check your model if you wish.


Tests:

- We have two kinds of tests, performance tests and regression tests.
Performance tests, while intended for performance testing, are also run
as part of make check (with a number of iterations of 1, so they are
fast), because they make for good regression test cases too.

- We have ~150 regression tests and indeed linking time is painful.
Runtime is not short either.  Our individual tests are all fast, but
each test is run against about 20 different targets (image, xlib, ps,
pdf, svg, special test surface, all of those once in rgb24 and once in
argb32 mode, ..., each producing output and log files).  So far we've
put each test in its own source file and that's part of the problem.
We've hit limits on the total number of command line arguments in our
"make clean" target multiple times!  So, grouping several tests in a
single source file is useful and encouraged.

- At times we need to just "run all text tests".  We don't have any way
to do that.  Grouping all text tests in a single file helps.  I see your
proposal already handles grouping.


Invocation:

- We have two main make targets for testing:

  * make test: runs the entire test suite and creates an html output
showing which tests passed and which failed, with links to log files and
showing image-diff of the failed tests.  Would be good to have enough
flexibility to do things like this in glib's testing framework.  That
is, plugging our own html generator, etc.

  * make retest: runs all the tests that indicate a failure in their log
files and generates html of only those tests.  This is extremely useful.

- We need a mode to invoke several tests from multiple threads, possibly
for a number of iterations, possibly in random order.  Chris Wilson
started doing that and found several multi-threading issues.
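
To illustrate the random-order part only (threads left out), here is a
minimal sketch in plain C.  It assumes the run order is derived from a
seed, so that an order which exposed a failure can be replayed.  The
names Rng, rng_next and shuffle_order are made up for this example and
are not part of cairo's test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic generator (a 32-bit LCG with the common
 * 1664525 / 1013904223 constants); good enough for ordering tests,
 * not for anything statistical or security-related. */
typedef struct { uint32_t state; } Rng;

static uint32_t
rng_next (Rng *rng)
{
    rng->state = rng->state * 1664525u + 1013904223u;
    return rng->state >> 8;    /* drop the weakest low bits */
}

/* Fill order[0..n-1] with a permutation of 0..n-1 (Fisher-Yates
 * shuffle).  The same seed always yields the same permutation. */
static void
shuffle_order (size_t *order, size_t n, uint32_t seed)
{
    Rng rng = { seed };

    assert (order != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        order[i] = i;
    for (size_t i = n; i > 1; i--) {
        size_t j = rng_next (&rng) % i;   /* 0 <= j < i; slight modulo bias */
        size_t tmp = order[i - 1];

        order[i - 1] = order[j];
        order[j] = tmp;
    }
}
```

A driver would log the seed together with the results, and rerun with
that seed to reproduce an ordering-dependent failure.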


Parameters:

- As I briefly mentioned, we have tests, and we have targets.  Each test
is run against all targets.  We have a number of ways to control which
tests and targets to test.  The general syntax is something like:

  make test TESTS=test1,test2,test3 TARGETS=target1,target2

There are also env vars controlling those.  For example, this one comes
in handy at release time when we don't want to let experimental backends
make releasing fail:

  CAIRO_TEST_TARGET=target1,target2 make distcheck

Chris even added recently:

  CAIRO_TEST_TARGET_EXCLUDE=bad-target make distcheck

The target matching is hierarchical.  That is, a target of glitz matches
glitz-glx-rgb24.

- We also have targets to run the test suite under valgrind.
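
A minimal sketch of the hierarchical target matching described above,
in plain C.  This is not cairo's actual code: it assumes that hierarchy
levels are separated by '-', that lists are comma-separated as in the
examples above, and that an exclude match wins over an include match.
All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if `pattern` names `target` or one of its ancestors:
 * "glitz" matches "glitz" and "glitz-glx-rgb24", but not "glitzy". */
static bool
target_matches (const char *pattern, const char *target)
{
    size_t n;

    assert (pattern != NULL && target != NULL);
    n = strlen (pattern);
    if (n == 0 || strncmp (pattern, target, n) != 0)
        return false;
    return target[n] == '\0' || target[n] == '-';
}

/* True if any entry of the comma-separated `list` matches `target`.
 * Empty entries and entries of 64 or more characters are skipped. */
static bool
list_matches (const char *list, const char *target)
{
    while (*list != '\0') {
        size_t len = strcspn (list, ",");
        char   entry[64];

        if (len > 0 && len < sizeof entry) {
            memcpy (entry, list, len);
            entry[len] = '\0';
            if (target_matches (entry, target))
                return true;
        }
        list += len;
        if (*list == ',')
            list++;
    }
    return false;
}

/* Decide whether a target runs; NULL means "variable not set".
 * Assumption: exclusion takes precedence over inclusion. */
static bool
target_selected (const char *include, const char *exclude,
                 const char *target)
{
    if (exclude != NULL && list_matches (exclude, target))
        return false;
    return include == NULL || list_matches (include, target);
}
```

Under these assumptions, target_selected ("glitz", NULL,
"glitz-glx-rgb24") is true and target_selected (NULL, "pdf",
"pdf-argb32") is false.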


What we don't have yet:

- The test+target model is a simple example of multidimensional testing
that can go even further.  For example, cairo has 14 different operators
and 5 different drawing operations, and four different source types.
One can write a meta-test that tests all combinations of above plus
different combinations of a pattern matrix (all combinations of
translate, scale, rotate).  Such a test though probably needs its own
driver and report generation tool, so it may well be beyond the scope of
a testing framework.  Thought I'd mention it.

Still, would be good to be able to support the test+target model.  It's
useful in pango too, as well as any other project having multiple
backends.  Note that not all tests should run against individual targets
though.  Some tests are independent and should be run once.
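
To give a feel for the size of that space, here is a small sketch in
plain C that enumerates the operator x drawing operation x source type
cross product using the counts quoted above (14 * 5 * 4 = 280
combinations).  The pattern matrix dimension is left out.  The type and
function names are hypothetical, and check_in_range only stands in for
a real per-combination check.

```c
#include <assert.h>
#include <stddef.h>

enum { N_OPERATORS = 14, N_DRAW_OPS = 5, N_SOURCES = 4 };

typedef struct {
    int op;      /* operator index,          0 .. N_OPERATORS - 1 */
    int draw;    /* drawing operation index, 0 .. N_DRAW_OPS - 1 */
    int source;  /* source type index,       0 .. N_SOURCES - 1 */
} Combo;

/* Total number of combinations in the cross product. */
static size_t
combo_count (void)
{
    return (size_t) N_OPERATORS * N_DRAW_OPS * N_SOURCES;
}

/* Decode a flat index into one combination (mixed-radix digits,
 * source type varying fastest). */
static Combo
combo_from_index (size_t i)
{
    Combo c;

    assert (i < combo_count ());
    c.source = (int) (i % N_SOURCES);
    i /= N_SOURCES;
    c.draw = (int) (i % N_DRAW_OPS);
    i /= N_DRAW_OPS;
    c.op = (int) i;
    return c;
}

/* A per-combination check returns nonzero on pass. */
typedef int (*ComboCheck) (Combo c);

/* Run `check` on every combination and return the number of failures. */
static size_t
run_all_combos (ComboCheck check)
{
    size_t failures = 0;

    for (size_t i = 0; i < combo_count (); i++)
        if (!check (combo_from_index (i)))
            failures++;
    return failures;
}

/* Placeholder check: a real one would draw and compare output. */
static int
check_in_range (Combo c)
{
    return c.op >= 0 && c.op < N_OPERATORS
        && c.draw >= 0 && c.draw < N_DRAW_OPS
        && c.source >= 0 && c.source < N_SOURCES;
}
```

Because every combination has a flat index, a single failing
combination can be reported and rerun by index alone.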


Regards,

behdad



On Thu, 2007-11-01 at 08:25 -0400, Tim Janik wrote:
> hi All.
> 
> with lots of help from Sven Herzberg and others, i've designed a unit test
> framework for GLib, which Sven and i plan to implement in the following weeks.
> i'll post about testing framework/utilities for Gtk+ at a later point.
> feedback is greatly appreciated.
> 
> QUICK READERS: two short test code examples are at the end of this mail;
>                 the proposed test framework API is attached as testapi.h;
>                 more elaborate example code is attached as testapi.c.
> 
> 
> to present the main design requirements up front:
> 
> - the best place for a testing framework is libglib.so. that way, the framework
>    can reuse lots of logic already supplied by glib, and also all of glib can
>    benefit from it. i.e. tests under glib/glib/TESTDIR can't use libgobject.so
>    for example. if people want some GObject-ified API at some point (which
>    i'm not currently planning to implement), this can be put into libgobject.so
>    since it depends on libglib.so.
> 
> - tests should run on all systems (to ease portability testing and debugging),
>    not just selected developer machines. that way they can aid in bug reporting.
>    it's already a nuisance that many developers need an extra tree for docu
>    building, so a testing framework should not come with additional or optional
>    dependencies for  basic functionality.
>    (this basically rules out optional use of external testing frameworks.)
>    also, external test frameworks easily become unmaintained over long periods
>    (sourceforge and google are full of examples) or at some point introduce
>    arbitrary dependencies. GLib/Gtk+ cannot afford that, putting the framework
>    into glib itself also ensures that it's properly maintained.
> 
> - the testing framework must be very simple and understandable to allow every
>    developer easy access to debugging of other people's code.
>    E.g. having magic prefix and trailing macros around test functions makes
>    reading, understanding and debugging of test code harder and should be
>    avoided. (E.g. Check has this, and we consider it a showstopper)
> 
> - by default, tests shouldn't be forked off, for performance reasons and
>    because forks are hard to debug. (nevertheless, we'll introduce API to
>    allow explicit forking where tests require this functionality.)
> 
> - the test framework must make it easy to run single tests isolated and shell
>    into gdb (this is particularly tricky when many tests are compiled into the
>    same binary to reduce link time.)
> 
> - unit tests need to be quick to be useful, otherwise people don't use them,
>    the testing framework needs to allow for that:
>      http://blogs.gnome.org/timj/2006/10/23/23102006-beast-and-unit-testing/trackback/
> 
> - linking tests or test suites should be fast. other projects (cairo, beast)
>    have found that linking test functions into many isolated
>    programs or shared libraries wastes too much build time as the suites grow.
> 
> - there is also a place for performance tests and slow but thorough tests.
>    however to not interfere with frequent quick tests, those probably should
>    run as part of different Makefile rules or part of a different test mode.
> 
> - many more detailed requirements are listed in the summary of last year's
>    testing framework discussion:
>      http://mail.gnome.org/archives/gtk-devel-list/2006-November/msg00039.html
>    one thing that became clear in that discussion is that many people out
>    there want full testing reports that list the number of failed tests
>    instead of abort-on-first-failure behavior, so the framework should allow
>    for this.
> 
> 
> we've identified these main testing scenarios that should be well covered:
> 
> 1) it should be easy and very quick to run a subset of the test suite, so
>     developers can run relevant tests before every commit. e.g. allow:
>       make -C gtk+/gtk/testtreeview test
> 
> 2) something unknown broke or seems fishy. run the whole test suite as
>     fast as possible, to roughly spot the area of concern:
>       make -C gtk+/ test-report
> 
> 3) we can't stand having a spare machine idling around.
>     run the whole test suite thoroughly over night (including lengthy brute
>     force tests, performance tests, everything):
>       make -C gtk+/ full-report
> 
> 4) we're interested in performance changes, because a critical core component
>     changed, or we want to compare performance between two releases:
>       make -C gtk+/ perf-report
> 
> 5) run the test suite as part of make distcheck. this is probably best
>     achieved by hooking up "make test" to "make check" and let automake
>     take care of the rest.
> 
> as said, these are *main* scenarios, sub variants like:
>    make -C gtk+/gtk/testtreeview perf-report
> and easily running test cases in gdb should of course also be possible.
> 
> 
> the API is for the most part designed according to established concepts found
> in the xUnit testing framework family (JUnit, NUnit, RUnit) see [1],
> which in turn is based on smalltalk unit testing concepts [2].
> that is:
> 
> - tests (a test method) are grouped together with their fixture into
>    test case objects: GTestCase
> 
> - a test fixture consists of fixture data and setup and teardown methods to
>    establish the environment for the test functions.
>    we use fresh fixtures, i.e. fixtures are newly set up and torn down around
>    each test invocation to avoid dependencies between tests.
> 
> - test cases can be grouped into test suites (GTestSuite), to allow subsets
>    of the available tests to be run.
>    suites can be grouped into other suites as well.
> 
> - we provide an extended set of assertions for strings, ints and floats
>    that allow printing of assertion arguments upon failures to reduce
>    the need for debugging:
>      g_assert_cmpfloat (arg1, cmpop, arg2);
>      g_assert_cmpint   (arg1, cmpop, arg2);
>      g_assert_cmpstr   (arg1, cmpop, arg2);
>    used like:
>      g_assert_cmpstr ("foo", !=, "faa");
>      g_assert_cmpfloat (3.3, <, epsilon);
>    g_assert() is still available of course, but using the above variants,
>    assertion messages can be more elaborate, e.g.:
>      ** testing.c:test_assertions(): assertion failed '(3.3 < epsilon)': (3.3 < 0.5)
> 
> - we'll provide a test binary wrapper that will take care of creating
>    machine readable test reports.
> 
> - we intend to implement test report post processing tools, e.g. to
>    generate html charts of performance test results.
> 
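
A sketch of how such a comparing assertion could be built, in plain C
without GLib.  This is not the macro from testapi.h or GLib's
implementation: SKETCH_ASSERT_CMPINT, last_failure and demo_failure are
hypothetical names, and the macro records the message in a buffer
instead of aborting so that it is easy to try out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Last failure message; empty if the last assertion held. */
static char last_failure[256];

/* Compare two integers with `cmpop`.  On failure, record both the
 * source expression and the evaluated operands.  Each argument is
 * evaluated exactly once. */
#define SKETCH_ASSERT_CMPINT(a, cmpop, b)                                    \
    do {                                                                     \
        long long a_ = (a), b_ = (b);                                        \
        if (a_ cmpop b_)                                                     \
            last_failure[0] = '\0';                                          \
        else                                                                 \
            snprintf (last_failure, sizeof last_failure,                     \
                      "%s:%d:%s(): assertion failed '(%s)': (%lld %s %lld)", \
                      __FILE__, __LINE__, __func__,                          \
                      #a " " #cmpop " " #b, a_, #cmpop, b_);                 \
    } while (0)

/* Example: provoke a failure and return the recorded message. */
static const char *
demo_failure (void)
{
    int limit = 3;

    SKETCH_ASSERT_CMPINT (2 + 2, <, limit);
    assert (last_failure[0] != '\0');   /* 4 < 3 does not hold */
    return last_failure;
}
```

The message returned by demo_failure() ends in
"assertion failed '(2 + 2 < limit)': (4 < 3)", so the reader sees the
operand values without starting a debugger.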
> 
> to make up for the lack of introspection support in C (like Java/JUnit have it)
> we make use of preprocessor macros and a hierarchical naming scheme:
> 
> - test cases and test suites have to be given names, so individual tests
>    can be addressed on the command line (e.g. to debug a specific test within
>    a huge test binary). this way, test cases or suites within a testing
>    binary can be referred to via pathnames:
>      /gtksuite/treeviewsuite/columnsuite/test text cell renderer
>      /gtksuite/windowsuite/test window title
>    etc.
> 
> - test suites can be created implicitly by registering a new test case
>    with its full pathname. this avoids fiddling with lots of case/suite
>    object references just to establish hierarchical grouping, e.g.:
>      g_test_add_func ("/misc/assertions", test_assertions);
>    registers the test function test_assertions as test case "assertions"
>    and implicitly creates the test suite "misc".
> 
> - assertions make use of __FILE__, __LINE__, __PRETTY_FUNCTION__ to provide
>    elaborate error messages.
> 
> - in the spirit of g_new() and g_slice_new(), g_test_add() accepts a fixture
>    type argument, to provide type safety for test functions that take a fixture
>    data argument.
> 
> - planned test runner interface:
>      > gtester --help
>      Usage: gtester [OPTIONS] testbinary [testbinaries...]
>      Run all test binaries and produce a report file gtester.log.
>      Options:
>        -l                List all available testpaths of a testbinary.
>        -p <testpath>     Run test suites and cases below <testpath>.
>        -m <mode>         Run tests in mode <mode>. Possible modes:
>                            perf  - run performance measurements,
>                            slow  - run tests that need lots of time,
>                            quick - run tests quickly (default).
>        -o <logfile>      Save testing log as <logfile>.
>        -k, --keep-going  Do not abort upon first test failure.
>        -q, --quiet       Quiet, suppress output from test binaries.
>        --seed <rand>     Random seed, specify this to force repeatable
>                          test results using g_test_rand_range().
> 
> - note that similar to gtester, testbinaries will themselves also support
>    -l, -m, -p and --seed, so individual tests can be debugged in gdb without
>    starting gtester.
> 
> - the testbinary and gtester
> 
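
A sketch of how path-based registration and "-p <testpath>" selection
could fit together, in plain C without GLib.  It is a simplification
and not the proposed implementation: test functions return a bool
instead of failing through an assertion, suites are not objects but
just path prefixes, and all names (sketch_add_func, path_is_below,
sketch_run, Report) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef bool (*TestFunc) (void);   /* returns true on pass */

typedef struct {
    const char *path;              /* e.g. "/misc/assertions" */
    TestFunc    func;
} TestCase;

enum { MAX_TESTS = 32 };
static TestCase registry[MAX_TESTS];
static size_t   n_tests;

/* Register a test under its full path; suites exist only implicitly,
 * as the directory-like components of the path. */
static void
sketch_add_func (const char *path, TestFunc func)
{
    assert (path != NULL && path[0] == '/' && func != NULL);
    assert (n_tests < MAX_TESTS);
    registry[n_tests].path = path;
    registry[n_tests].func = func;
    n_tests++;
}

/* True if `path` is `prefix` itself or lies below it in the hierarchy. */
static bool
path_is_below (const char *prefix, const char *path)
{
    size_t n = strlen (prefix);

    if (n > 0 && prefix[n - 1] == '/')
        n--;                       /* tolerate a trailing slash */
    if (strncmp (prefix, path, n) != 0)
        return false;
    return path[n] == '\0' || path[n] == '/';
}

typedef struct { size_t run, failed; } Report;

/* Run every registered test at or below `prefix`.  With keep_going
 * set, failures are counted; otherwise the run stops at the first. */
static Report
sketch_run (const char *prefix, bool keep_going)
{
    Report r = { 0, 0 };

    for (size_t i = 0; i < n_tests; i++) {
        if (!path_is_below (prefix, registry[i].path))
            continue;
        r.run++;
        if (!registry[i].func ()) {
            r.failed++;
            if (!keep_going)
                break;
        }
    }
    return r;
}

/* Trivial test bodies for trying the sketch out. */
static bool test_pass (void) { return true; }
static bool test_fail (void) { return false; }
```

Matching on whole path components means that "/misc" selects
"/misc/assertions" but not "/miscellaneous/x".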
> since this email is quite long already, for the rest, i'll just
> introduce the shortest possible test program and an extended example:
> 
> ==============shortest-test-program==============
> static void
> test_number_assertion (void)
> {
>     g_assert_cmpint (4, ==, 2 + 2);
> }
> int
> main (int   argc,
>        char *argv[])
> {
>     g_test_init (&argc, &argv, NULL);
>     g_test_add_func ("/misc/number assertion", test_number_assertion);
>     return g_test_run();
> }
> ==============
> 
> 
> the following is a test program that maintains a test specific fixture
> around running a test case:
> 
> ==============fixture-test-program==============
> typedef struct {
>     gchar *string;
> } Stringtest;
> static void
> stringtest_setup (Stringtest *fix)
> {
>     fix->string = g_strdup ("foo");
> }
> static void
> stringtest_test (Stringtest *fix)
> {
>     g_assert_cmpstr (fix->string, ==, "foo");
> }
> static void
> stringtest_teardown (Stringtest *fix)
> {
>     g_free (fix->string);
> }
> int
> main (int   argc,
>        char *argv[])
> {
>     g_test_init (&argc, &argv, NULL);
>     g_test_add ("/misc/stringtest", Stringtest, // <- fixture type
>                 stringtest_setup, stringtest_test, stringtest_teardown);
>     return g_test_run();
> }
> ==============
> 
> conceptually, the Stringtest structure and the stringtest_setup, stringtest_test
> and stringtest_teardown functions (plus assertion macros if you will) correspond
> to the TestCase object introduced in JUnit [3] and many other frameworks. the proposed
> type safe version of g_test_add() is the shortest and most convenient notation
> i could come up with to implement the concept in plain C.
> apart from library dependency issues, this _could_ be implemented as GObject or
> a GInterface, however without providing significant benefits (a GType ID
> isn't needed here, and we'd not want to do cross library boundary inheritance
> of test case objects).
> 
> 
> i've attached two files:
> 
> testapi.h	intended public API additions to implement in libglib.so
>  		(except for g_test_queue_unref).
> 
> testapi.c	example code, showing off how the API from testapi.h is
>  		intended to be used. note that main2() reimplements the
>  		functionality from main(), albeit avoiding the testpath
>  		convenience API. this demonstrates the maintenance savings
>  		by using the testpath API.
> 
> given the above scope description and terminology introductions, the API in
> testapi.h should be self revealing. where it isn't, please come back to me
> with your questions.
> 
> 
> [1] "xUnit Test Patterns: Refactoring Test Code" by Gerard Meszaros.
>      book:    http://www.amazon.com/dp/0131495054
>      website: http://xunitpatterns.com/
>      (the website is really good as it describes many patterns
>      introduced in the book)
> 
> [2] "Simple Smalltalk Testing: With Patterns" by Kent Beck.
>      link: http://www.xprogramming.com/testfram.htm
> 
> [3] "JUnit Test Infected: Programmers Love Writing Tests" by Erich Gamma and
>      Kent Beck
>      link: http://junit.sourceforge.net/doc/testinfected/testing.htm
>      (Excellent introduction to automated unit testing)
> 
> ---
> ciaoTJ
-- 
behdad
http://behdad.org/

"Those who would give up Essential Liberty to purchase a little
 Temporary Safety, deserve neither Liberty nor Safety."
        -- Benjamin Franklin, 1759