call for collaborators: bug isolation via remote program sampling
- From: Ben Liblit <liblit cs berkeley edu>
- To: gnome-bugsquad gnome org
- Subject: call for collaborators: bug isolation via remote program sampling
- Date: Wed, 23 Apr 2003 15:18:51 -0700
My collaborators and I are working on a tool for isolating bugs using
random sampling of user executions. This message is an open call for
collaboration. Would any of you be interested in working with us to use
our bug isolation system with your GNOME projects or distributions? If
so, please read on.
This tool is part of UC Berkeley's Open Source Quality Project
<http://osq.cs.berkeley.edu/>, which investigates methods for improving
software quality. In brief, our approach is to collect a little
information from each of many runs and look for program behaviors that
vary between successful and unsuccessful runs. For example, we might
discover that a program crashes when a particular function call returns
-1, or when a particular array index exceeds some maximum value. Even
for non-deterministic bugs such as heap corruption, we can build
statistical models that show behavior correlated with crashes across
many runs.
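
A rough illustration of the comparison described above, and not the project's actual statistical model: the sketch below assumes each run reports a set of observed predicates plus a crash flag, and ranks predicates by how much more often they show up in failed runs than in successful ones. The function name `rank_predicates`, the predicate strings, and the sample data are all hypothetical.

```python
from collections import Counter


def rank_predicates(runs):
    """Rank predicates by (share of failed runs) - (share of successful runs).

    `runs` is a list of (observed_predicates, crashed) pairs, where
    observed_predicates is a set of hypothetical predicate names.
    """
    failed = Counter()
    passed = Counter()
    n_failed = n_passed = 0
    for observed, crashed in runs:
        if crashed:
            n_failed += 1
            failed.update(observed)
        else:
            n_passed += 1
            passed.update(observed)

    scores = {}
    for pred in set(failed) | set(passed):
        fail_rate = failed[pred] / n_failed if n_failed else 0.0
        pass_rate = passed[pred] / n_passed if n_passed else 0.0
        scores[pred] = fail_rate - pass_rate
    # Highest score first; ties broken by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up reports from five runs (illustrative data only).
runs = [
    ({"f() returned -1", "i > max"}, True),
    ({"f() returned -1"}, False),
    ({"i > max"}, True),
    (set(), False),
    ({"f() returned -1"}, False),
]
ranking = rank_predicates(runs)
for pred, score in ranking:
    print(f"{score:+.2f}  {pred}")
```

In this made-up data, "i > max" appears in every failed run and in no successful run, so it ranks first. A difference of rates is only the simplest possible score; real data would call for a more careful statistical treatment.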
We've already had a few successes in controlled experimental
environments, including discovering a previously unreported buffer
overrun. What we'd like to do now is deploy our system with real
applications, real bugs, and real users.
Our approach works best when it can see many runs, so we need a
community of users who are willing to run instrumented code and provide
us with that raw data to mine for bugs. If you work on a project that
provides testing binaries to willing guinea pigs, then you're someone
we'd like to work with. (Only C projects need apply, though, as that's
the only language supported by our current implementation.)
The benefit for us is real data; the benefit for you is free help with
bug hunting. The sampling is designed to be very low overhead, so the
performance penalty for you or your users will be modest (our worst
example so far runs 12% slower when instrumented, and for our best
example the overhead is literally unmeasurable). Furthermore, our
sampling-based approach implicitly "learns" the most about the bugs that
happen most often, so we may be able to give you the most useful
information about the bugs that are hitting the largest number of your
users.
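
To give a feel for why sparse sampling can be cheap, here is a minimal sketch. It assumes, purely for illustration, that an instrumented program skips a random number of instrumentation sites between observations, so that most site visits cost only a counter decrement. The `Sampler` class and its parameters are hypothetical; the detailed writeup listed below is the reference for how the real instrumentation sampling works.

```python
import math
import random


class Sampler:
    """Observe roughly 1 in `rate` site visits using a random countdown."""

    def __init__(self, rate, rng):
        self.rate = rate
        self.rng = rng
        self.countdown = self._next_gap()

    def _next_gap(self):
        # Draw a geometric gap (>= 1) with success probability 1/rate,
        # using inverse-transform sampling.
        if self.rate <= 1:
            return 1
        u = 1.0 - self.rng.random()  # uniform in (0, 1]
        return int(math.log(u) / math.log(1.0 - 1.0 / self.rate)) + 1

    def should_sample(self):
        # Fast path: just decrement the counter and move on.
        self.countdown -= 1
        if self.countdown > 0:
            return False
        # Slow path: take one observation and pick the next gap.
        self.countdown = self._next_gap()
        return True


# Simulate 100,000 visits to instrumentation sites at a 1/100 rate.
sampler = Sampler(rate=100, rng=random.Random(0))
sampled = sum(sampler.should_sample() for _ in range(100_000))
print(f"sampled {sampled} of 100000 site visits")
```

Because every site visit has the same chance of being observed, behavior that occurs in many runs is reported more often in aggregate, which matches the point above about learning the most about the most frequent bugs.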
We've written a couple of papers about our approach:
- "Sampling User Executions for Bug Isolation", a short position
paper that presents the general approach and describes some
initial experiments: <http://www.cs.berkeley.edu/~liblit/ramss/>
- "Bug Isolation via Remote Program Sampling", a much more detailed
writeup which describes how the instrumentation sampling works,
measures performance impact, and gives several examples of using
the system to track down bugs:
<http://www.cs.berkeley.edu/~liblit/bug-isolation/>
Of course, we're also happy to discuss this with any of you on this list
or in person-to-person e-mail. Our goal right now is to find real-world
collaborators, so if you are at all interested or if you have any
questions, please ask!
-- Ben Liblit <liblit cs berkeley edu>