Re: Gtk+ unit tests (brainstorming)

From: Iago Toral Quiroga <itoral igalia com>
To: Tim Janik <timj imendio com>
Cc: Gtk+ Developers <gtk-devel-list gnome org>
Subject: Re: Gtk+ unit tests (brainstorming)
Date: Wed, 15 Nov 2006 10:51:57 +0100
On Tue, 14-11-2006 at 15:33 +0100, Tim Janik wrote:
> > I understand your point, however I still think that being able to get a
> > wider report with all the tests failing at a given moment is also
> > interesting (for example in a buildbot continuous integration loop, like
> > the one being prepared by the build-brigade). Besides, if there is a
> > group of people that want to work on fixing bugs at the same time, they
> > would need to get a list of tests failing, not only the first one.
> 
> well, you can get that to some extent automatically, if you invoke
>    $ make -k check

Yes, if we split the tests into independent test programs I think that's
a reasonable approach.

> going beyond that would be a bad trade off i think because:

[...]

> c) implementation and test code often has dependencies that won't allow
>     to test beyond the occurrence of an error. a simple example is:
>       o = object_new();
>       ASSERT (o != NULL); /* no point to continue beyond this on error */
>       test_function (o);
>     a more intimidating case is:
>       main() {
>         test_gtk_1();
>         test_gtk_2();
>         test_gtk_3();
>         test_gtk_4();
>         test_gtk_5();
>         // ...
>       }
>     if any of those test functions (say test_gtk_3) produces a gtk/glib
>     error/assertion/warning/critical, the remaining test functions (4, 5, ...)
>     are likely to fail for bogus reasons because the libraries entered
>     undefined state.
>     reports of those subsequent errors (which are likely to be very
>     misleading) are useless at best and confusing (in terms of what error
>     really matters) at worst.
>     yes, forking for each of the test functions works around that (provided
>     they are as independent of one another as in the example above), but again,
>     this complicates the test implementation (it's not an easy to understand
>     test program anymore) and debuggability, i.e. affects the 2 main
>     properties of a good test program.

Mmm... actually, based on my experience using Check in fork mode, it
does not complicate the test implementation beyond adding a gtk_init()
to each forked test. Check forks the tests transparently, controlled by
an environment variable. The same applies to debugging: it is true that
debugging a forked test program is annoying, but you can disable fork
mode when debugging just by setting the environment variable:

$ CK_FORK=no gdb mytest

This shouldn't be a problem if we stop test execution after the first
failed test.
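
For illustration only, this is roughly what running each test in its own
process buys us, sketched with plain POSIX fork()/waitpid() rather than
with Check itself. run_isolated() and the two test functions are
hypothetical names, and this is not meant to describe how Check is
actually implemented:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*test_func) (void);

/* Hypothetical helper: run one test function in a child process.
 * Returns 0 if the child exited normally with status 0, 1 if it
 * failed or was killed by a signal, and -1 if fork/waitpid failed. */
static int
run_isolated (test_func test)
{
  pid_t pid;
  int status = 0;

  fflush (NULL);              /* do not duplicate buffered output */
  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      test ();                /* a crash here only kills the child */
      fflush (NULL);
      _exit (0);
    }
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return (WIFEXITED (status) && WEXITSTATUS (status) == 0) ? 0 : 1;
}

/* Two hypothetical tests: one passes, one aborts on a failed assertion. */
static void
test_passes (void)
{
}

static void
test_aborts (void)
{
  int *o = NULL;
  assert (o != NULL);         /* no point to continue beyond this */
}
```

Calling run_isolated (test_aborts) followed by run_isolated (test_passes)
reports one failure and one success: the abort in the first child cannot
leave the libraries in an undefined state for the second test.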

> > Maybe you would like to test how the library handles invalid input. For
> > example, let's say we have a function that accepts a pointer as
> > parameter, I think it is worth knowing whether that function safely
> > handles the case when that pointer is NULL (if NULL is not an allowed
> > value for that parameter) or whether it produces a segmentation fault.
> 
> no, it really doesn't make sense to test functions outside the defined
> value ranges. that's because when implementing, the only thing you need
> to actually care about from an API perspective is: the defined value ranges.
> besides that, value ranges may compatibly be *extended* in future versions,
> which would make value range restriction tests break unnecessarily.
> if a function is not defined for say (char*)0, adding a test that asserts
> certain behaviour for (char*)0 is effectively *extending* the current
> value range to include (char*)0 and then testing the proper implementation
> of this extended case. the outcome of which would be a CRITICAL or a segfault
> though, and with the very exception of g_critical(), *no* glib/gtk function
> implements this behaviour purposefully, compatibly, or in a documented way.
> so such a test would at best be bogus and unnecessary.

I think the main difference here between your point of view and mine is
that I'm seeing the API from the user side, while you see it from the
developer side. Let me explain:

From a developer point of view, it is ok to say that an API function
only works within a concrete range of values and to test that it really
works ok within that range. However, in practice, user programs are full
of bugs, which usually means that under certain conditions, they are not
using the APIs as they are supposed to be used. That said, any user would
prefer such an API to handle those situations as safely as possible: if
I'm writing a GTK+ application I'd prefer it to safely handle a misuse on
my side and warn me about the issue, rather than break badly due to a
segmentation fault and make me lose all my data ;)
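
To make the kind of behaviour I mean concrete, here is a minimal sketch
of a function that warns and returns instead of crashing on NULL. The
RETURN_VAL_IF_FAIL macro is a hypothetical stand-in modelled loosely on
GLib's g_return_val_if_fail(), and label_length() and warning_count are
made up for the example:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Counts emitted warnings so a test can check that one was issued. */
static int warning_count = 0;

/* Hypothetical guard, loosely modelled on g_return_val_if_fail():
 * on a failed precondition, log a message and return early. */
#define RETURN_VAL_IF_FAIL(expr, val)                        \
  do {                                                       \
    if (!(expr))                                             \
      {                                                      \
        warning_count++;                                     \
        fprintf (stderr, "CRITICAL: %s: '%s' failed\n",      \
                 __func__, #expr);                           \
        return (val);                                        \
      }                                                      \
  } while (0)

/* Hypothetical API function: NULL is outside its defined value range,
 * but it warns and returns -1 instead of dereferencing the pointer. */
static long
label_length (const char *text)
{
  RETURN_VAL_IF_FAIL (text != NULL, -1);
  return (long) strlen (text);
}
```

A test for this case would call label_length (NULL) and check that it
returns and that a warning was recorded, instead of segfaulting.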

> > I'll add here some points supporting Check ;):
> 
> ok, addressing them one by one, since i see multiple reasons for not
> using Check ;)
> 
[...]
> it's not clear that Check (besides being an additional dependency in
> itself) fulfils all the portability requirements of glib/gtk+ for these
> cases though.
> 
[...]
> - Check may be widely used, but is presented as "[...] at the moment only
>    sporadically maintained" (http://check.sourceforge.net/).
>    that alone causes me to veto any Check dependency for glib/gtk already ;)

I've just asked Chris Pickett (Check maintainer) about these issues, so
he can confirm. I'll forward his opinion in a later mail.

> > You never know when you will need another feature for your testing. If
> > you use a tool, maybe it provides it, and if it does not, you can always
> > request it. But if you don't use a testing tool you will always
> > need to implement it yourself.
> 
> as i said, this can be a plus. and testing can be foreseen with considerable
> confidence, to not need specialized rocket science anytime soon.
> that being said, we can still opt to depend or integrate with any test
> "framework" out there at any future point if rocket science requirements
> indeed do emerge ;)

Yes, we can define a set of interfaces so that any other test framework
can be plugged in easily in the future. It would not be a lot of work.

> > - I think unit tests for an interface should consider 3 cases:
> >   1. Test of regular values: how does the interface behave when I try
> > it normally.
> >   2. Test of limit values: how does the interface behave when I try it
> > with a limit value? (for example the last and the first elements of an
> > array, etc)
> 
> jup, with the param specs in glib, we have a good chance of covering most
> interesting limits and by random selection also many intermediate values
> for properties with ordered value ranges.

Yes, I definitely have to look into this stuff in detail.
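
As a first rough idea of what that could look like, here is a sketch
that derives test values from a range description: both limits, the
default, and a few random values in between. IntRange is a hypothetical
struct that only mirrors the minimum/maximum/default_value fields of an
integer param spec, and range_test_values() is a made-up helper, not an
existing GLib function:

```c
#include <stdlib.h>

/* Hypothetical range description, mirroring the minimum, maximum and
 * default_value fields of an integer param spec. */
typedef struct {
  int minimum;
  int maximum;
  int default_value;
} IntRange;

/* Fill values[] with the two limits, the default, and then random
 * values inside [minimum, maximum] until capacity is reached.
 * Returns the number of values written, or 0 if capacity is below 3
 * or the range is empty. */
static size_t
range_test_values (const IntRange *range, unsigned int seed,
                   int *values, size_t capacity)
{
  /* Compute the span in long long so a full int range cannot overflow. */
  long long span = (long long) range->maximum - range->minimum + 1;
  size_t n = 0;

  if (capacity < 3 || span < 1)
    return 0;

  values[n++] = range->minimum;        /* lower limit */
  values[n++] = range->maximum;        /* upper limit */
  values[n++] = range->default_value;  /* regular value */

  srand (seed);                        /* fixed seed: reproducible runs */
  while (n < capacity)
    values[n++] = (int) (range->minimum + rand () % span);

  return n;
}
```

Each generated value would then be set on the property under test and
read back. Using a fixed seed keeps a failing run reproducible.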

Iago.