4,445 research outputs found
Test Case Purification for Improving Fault Localization
Finding and fixing bugs are time-consuming activities in software
development. Spectrum-based fault localization aims to identify the faulty
position in source code based on the execution trace of test cases. Failing
test cases and their assertions form test oracles for the failing behavior of
the system under analysis. In this paper, we propose a novel concept of
spectrum driven test case purification for improving fault localization. The
goal of test case purification is to separate existing test cases into small
fractions (called purified test cases) and to enhance the test oracles to
further localize faults. Combining with an original fault localization
technique (e.g., Tarantula), test case purification results in better ranking
the program statements. Our experiments on 1800 faults in six open-source Java
programs show that test case purification can effectively improve existing
fault localization techniques
Learning Tractable Probabilistic Models for Fault Localization
In recent years, several probabilistic techniques have been applied to
various debugging problems. However, most existing probabilistic debugging
systems use relatively simple statistical models, and fail to generalize across
multiple programs. In this work, we propose Tractable Fault Localization Models
(TFLMs) that can be learned from data, and probabilistically infer the location
of the bug. While most previous statistical debugging methods generalize over
many executions of a single program, TFLMs are trained on a corpus of
previously seen buggy programs, and learn to identify recurring patterns of
bugs. Widely-used fault localization techniques such as TARANTULA evaluate the
suspiciousness of each line in isolation; in contrast, a TFLM defines a joint
probability distribution over buggy indicator variables for each line. Joint
distributions with rich dependency structure are often computationally
intractable; TFLMs avoid this by exploiting recent developments in tractable
probabilistic models (specifically, Relational SPNs). Further, TFLMs can
incorporate additional sources of information, including coverage-based
features such as TARANTULA. We evaluate the fault localization performance of
TFLMs that include TARANTULA scores as features in the probabilistic model. Our
study shows that the learned TFLMs isolate bugs more effectively than previous
statistical methods or using TARANTULA directly.Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI
2015
Semi-automatic fault localization
One of the most expensive and time-consuming components of the debugging
process is locating the errors or faults. To locate faults, developers must identify
statements involved in failures and select suspicious statements that might contain
faults. In practice, this localization is done by developers in a tedious and manual
way, using only a single execution, targeting only one fault, and having a limited
perspective into a large search space.
The thesis of this research is that fault localization can be partially automated
with the use of commonly available dynamic information gathered from test-case
executions in a way that is effective, efficient, tolerant of test cases that pass but also
execute the fault, and scalable to large programs that potentially contain multiple
faults. The overall goal of this research is to develop effective and efficient fault
localization techniques that scale to programs of large size and with multiple faults.
There are three principle steps performed to reach this goal: (1) Develop practical
techniques for locating suspicious regions in a program; (2) Develop techniques to
partition test suites into smaller, specialized test suites to target specific faults; and
(3) Evaluate the usefulness and cost of these techniques.
In this dissertation, the difficulties and limitations of previous work in the area
of fault-localization are explored. A technique, called Tarantula, is presented that
addresses these difficulties. Empirical evaluation of the Tarantula technique shows
that it is efficient and effective for many faults. The evaluation also demonstrates
that the Tarantula technique can loose effectiveness as the number of faults increases.
To address the loss of effectiveness for programs with multiple faults, supporting
techniques have been developed and are presented. The empirical evaluation of these
supporting techniques demonstrates that they can enable effective fault localization in
the presence of multiple faults. A new mode of debugging, called parallel debugging, is
developed and empirical evidence demonstrates that it can provide a savings in terms
of both total expense and time to delivery. A prototype visualization is provided to
display the fault-localization results as well as to provide a method to interact and
explore those results. Finally, a study on the effects of the composition of test suites
on fault-localization is presented.Ph.D.Committee Chair: Harrold, Mary Jean; Committee Member: Orso, Alessandro; Committee Member: Pande, Santosh; Committee Member: Reiss, Steven; Committee Member: Rugaber, Spence
Time-Space Efficient Regression Testing for Configurable Systems
Configurable systems are those that can be adapted from a set of options.
They are prevalent and testing them is important and challenging. Existing
approaches for testing configurable systems are either unsound (i.e., they can
miss fault-revealing configurations) or do not scale. This paper proposes
EvoSPLat, a regression testing technique for configurable systems. EvoSPLat
builds on our previously-developed technique, SPLat, which explores all
dynamically reachable configurations from a test. EvoSPLat is tuned for two
scenarios of use in regression testing: Regression Configuration Selection
(RCS) and Regression Test Selection (RTS). EvoSPLat for RCS prunes
configurations (not tests) that are not impacted by changes whereas EvoSPLat
for RTS prunes tests (not configurations) which are not impacted by changes.
Handling both scenarios in the context of evolution is important. Experimental
results show that EvoSPLat is promising. We observed a substantial reduction in
time (22%) and in the number of configurations (45%) for configurable Java
programs. In a case study on a large real-world configurable system (GCC),
EvoSPLat reduced 35% of the running time. Comparing EvoSPLat with sampling
techniques, 2-wise was the most efficient technique, but it missed two bugs
whereas EvoSPLat detected all bugs four times faster than 6-wise, on average.Comment: 14 page
- …