Re: [Evolution-hackers] missing Evolution 1.4.4 source RPMs
- From: Ben Liblit <liblit cs berkeley edu>
- To: evolution-hackers lists ximian com
- Subject: Re: [Evolution-hackers] missing Evolution 1.4.4 source RPMs
- Date: Thu, 14 Aug 2003 12:56:14 -0700
Jeffrey Stedfast wrote:
> any way for us to find out more about the types of techniques you are
> working on? anything that might help debug this beast would be cool :-)
In brief, the instrumented code makes a large number of wild guesses
about "interesting" behavior, and counts how often these interesting
things happen. We then use a suite of statistical machine learning
techniques to find *changes* in interesting behavior between runs that
fail (crash) and runs that succeed.
"Interesting" can be, well, anything you want to count up. Some
examples include:
- how often each function call returns <0, ==0, or >0
- how often each conditional branches left versus right
- how often each assigned variable is <, ==, or > each other
in-scope variable or source-code constant
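To make the first guess concrete, here's a rough C sketch of what one
instrumented call site might look like. This is illustration only: the
real instrumentor rewrites the code automatically, the counter names
and layout below are made up, and fetch_ldap_id() is just the same
example function I use again further down.

    /* Hypothetical sketch of one instrumented call site: tally how
     * often the call returns <0, ==0, or >0. */

    static unsigned long site_42_counts[3];   /* counts for <0, ==0, >0 */

    extern int fetch_ldap_id(void);           /* the call being watched */

    static int instrumented_fetch_ldap_id(void)
    {
        int result = fetch_ldap_id();         /* the original call */

        if (result < 0)
            site_42_counts[0]++;
        else if (result == 0)
            site_42_counts[1]++;
        else
            site_42_counts[2]++;

        return result;
    }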
Like I said, these are very wild guesses. Most of them are completely
irrelevant. The machine learning algorithms filter out the irrelevant
stuff in order to find the few behaviors which are strongly correlated
with success versus failure. So, for example, I might be able to tell
you that Evolution tends to crash when fetch_ldap_id() returns 0 and it
tends not to crash when fetch_ldap_id() returns >0. Or perhaps it tends
to crash when a particular "if" condition is true, because that's a
rare case that hasn't been tested well.
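Just to illustrate the flavor of that filtering (this is a toy score,
not the actual statistical models we use), imagine computing, for each
counted behavior, how much more often it shows up in crashing runs than
in successful ones. All the numbers below are made up.

    #include <stdio.h>

    /* One counted behavior ("predicate") aggregated over many runs. */
    struct predicate {
        const char   *name;
        unsigned long true_in_failures;   /* crashing runs where it held */
        unsigned long failures;           /* total crashing runs */
        unsigned long true_in_successes;  /* successful runs where it held */
        unsigned long successes;          /* total successful runs */
    };

    /* Crude correlation score: fraction of failing runs where the
     * predicate held, minus the same fraction for successful runs.
     * Near +1 means "seen almost only in crashes" -- a good bug clue. */
    static double failure_correlation(const struct predicate *p)
    {
        double fail_rate = p->failures
            ? (double)p->true_in_failures / p->failures : 0.0;
        double succ_rate = p->successes
            ? (double)p->true_in_successes / p->successes : 0.0;
        return fail_rate - succ_rate;
    }

    int main(void)
    {
        /* Made-up numbers, just to show the shape of the computation. */
        struct predicate example = { "fetch_ldap_id() == 0", 47, 50, 3, 200 };
        printf("%s  score = %.2f\n", example.name,
               failure_correlation(&example));
        return 0;
    }

The real models also have to cope with the sampling noise described
below, so they are considerably more involved than this.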
Adding to the challenge is the fact that we don't even count everything
on a given run. We randomly sample perhaps 1/100 or 1/1000 of the
behavior. This helps us keep overhead down by spending most of the time
in fast uninstrumented code. It also has the side effect of improving
privacy, as we really cannot learn very much at all about any single
run. But in *aggregate*, over many runs, a fair picture of program
(mis)behavior will emerge. The statistical models we use treat the
sparse sampling as measurement noise; given enough runs, we can still
get useful bug clues from sparsely sampled data.
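If it helps to picture the sampling, here's a hand-written sketch of
the idea. Again hedged: the real tool does this with a compiler
transformation so the common case stays on a fast, uninstrumented path,
and it uses a proper geometric distribution rather than the uniform
approximation used here.

    #include <stdlib.h>

    #define SAMPLE_RATE 100         /* count roughly 1 in 100 observations */

    static long until_next_sample;  /* observations left before we count one */

    static void schedule_next_sample(void)
    {
        /* Uniform draw with a mean of about SAMPLE_RATE observations. */
        until_next_sample = 1 + rand() % (2 * SAMPLE_RATE - 1);
    }

    /* Called at each instrumentation site; nonzero means "record this
     * particular observation", zero means skip it. */
    static int should_sample(void)
    {
        if (--until_next_sample > 0)
            return 0;               /* cheap common case: skip */
        schedule_next_sample();
        return 1;                   /* rare case: actually count */
    }

A counter update from the earlier sketch would then be guarded, e.g.
"if (should_sample()) site_42_counts[0]++;", so most observations never
touch the counters at all.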
See <http://www.cs.berkeley.edu/~liblit/sampler/> for some packages
we've posted. We still need to write up a good non-technical primer.
Until that's done, the "Background Reading" section on that page has
pointers to papers describing how this all works in much more detail.
Now, how about those specfiles...? {taps foot} :-)