call for collaborators: bug isolation via remote program sampling

From: Ben Liblit <liblit cs berkeley edu>
To: gnome-bugsquad gnome org
Subject: call for collaborators: bug isolation via remote program sampling
Date: Wed, 23 Apr 2003 15:18:51 -0700

My collaborators and I are working on a tool for isolating bugs using random sampling of user executions. This message is an open call for collaboration. Would any of you be interested in working with us to use our bug isolation system with your GNOME projects or distributions? If so, please read on.

This tool is part of UC Berkeley's Open Source Quality Project <http://osq.cs.berkeley.edu/>, which investigates methods for improving software quality. In brief, our approach is to collect a little information from each of many runs and look for program behaviors that vary between successful and unsuccessful runs. For example, we might discover that a program crashes when a particular function call returns -1, or when a particular array index exceeds some maximum value. Even for non-deterministic bugs such as heap corruption we can build statistical models that show behavior correlated with crashes across many runs.
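
As a rough illustration of the idea (not our actual implementation, which instruments C programs), the following Python sketch simulates many runs of a made-up program, records only a random sample of predicate observations from each run, and then ranks predicates by how often the runs in which they were observed ended in a crash. The predicate names, the sampling rate, the failure probabilities, and the simple crash-ratio score are all assumptions chosen for the example.

```python
import random
from collections import Counter


def simulate_run(rng, sampling_rate):
    """Simulate one run of a hypothetical program.

    Returns (observed, crashed): the set of predicates that were sampled
    as true at least once during the run, and whether the run crashed.
    """
    crashed = rng.random() < 0.2          # assumed: 20% of runs hit the bug
    observed = set()
    for _ in range(100):                  # assumed: 100 calls per run
        # In buggy runs, the hypothetical call f() sometimes returns -1.
        ret = -1 if (crashed and rng.random() < 0.5) else 0
        index_large = rng.random() < 0.3  # unrelated to the crash
        # Each instrumentation site is only recorded occasionally.
        if rng.random() < sampling_rate:
            observed.add("f() == -1" if ret == -1 else "f() == 0")
        if index_large and rng.random() < sampling_rate:
            observed.add("index > 10")
    return observed, crashed


def rank_predicates(runs):
    """Rank predicates by the fraction of observing runs that crashed."""
    in_failed, in_ok = Counter(), Counter()
    for observed, crashed in runs:
        (in_failed if crashed else in_ok).update(observed)
    scores = {
        pred: in_failed[pred] / (in_failed[pred] + in_ok[pred])
        for pred in set(in_failed) | set(in_ok)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [simulate_run(rng, sampling_rate=0.1) for _ in range(2000)]
    for pred, score in rank_predicates(runs):
        print(f"{score:.2f}  {pred}")
```

In this toy setup the predicate "f() == -1" can only ever be observed in crashing runs, so it rises to the top of the ranking even though any single run reports just a small fraction of what it executed.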

We've already had a few successes in controlled experimental environments, including discovering a previously unreported buffer overrun. What we'd like to do now is deploy our system with real applications, real bugs, and real users.

Our approach works best when it can see many runs, so we need a community of users who are willing to run instrumented code and provide us with that raw data to mine for bugs. If you work on a project that provides testing binaries to willing guinea pigs, then you're someone we'd like to work with. (Only C projects need apply, though, as that's the only language supported by our current implementation.)

The benefit for us is real data; the benefit for you is free help with bug hunting. The sampling is designed to be very low overhead, so the performance penalty for you or your users will be modest (our worst example so far runs 12% slower when instrumented, and for our best example the overhead is literally unmeasurable). Furthermore, our sampling-based approach implicitly "learns" the most about the bugs that happen most often, so we may be able to give you the most useful information about the bugs that are hitting the largest number of your users.


We've written a couple of papers about our approach:

   - "Sampling User Executions for Bug Isolation", a short position
     paper that presents the general approach and describes some
     initial experiments: <http://www.cs.berkeley.edu/~liblit/ramss/>

   - "Bug Isolation via Remote Program Sampling", a much more detailed
     writeup which describes how the instrumentation sampling works,
     measures performance impact, and gives several examples of using
     the system to track down bugs:
     <http://www.cs.berkeley.edu/~liblit/bug-isolation/>

Of course, we're also happy to discuss this with any of you on this list or in person-to-person e-mail. Our goal right now is to find real-world collaborators, so if you are at all interested or if you have any questions, please ask!


				-- Ben Liblit <liblit cs berkeley edu>
