4,241 research outputs found
UNIT-LEVEL ISOLATION AND TESTING OF BUGGY CODE
In real-world software development, maintenance plays a major role and developers spend 50-80% of their time in maintenance-related activities. During software maintenance, a significant amount of effort is spent on ending and fixing bugs. In some cases, the fix does not completely eliminate the buggy behavior; though it addresses the reported problem, it fails to account for conditions that could lead to similar failures. There could be many possible reasons: the conditions may have been overlooked or difficult to reproduce, e.g., when the components that invoke the code or the underlying components it interacts with can not put it in a state where latent errors appear. We posit that such latent errors can be discovered sooner if the buggy section can be tested more thoroughly in a separate environment, a strategy that is loosely analogous to the medical procedure of performing a biopsy where tissue is removed, examined and subjected to a battery of tests to determine the presence of a disease.
In this thesis, we propose a process in which the buggy code is extracted and isolated in a test framework. Test drivers and stubs are added to exercise the code and observe its interactions with its dependencies. We lay the groundwork for the creation of an automated tool for isolating code by studying its feasibility and investigating existing testing technologies that can facilitate the creation of such drivers and stubs. We investigate mocking frameworks, symbolic execution and model checking tools and test their capabilities by examining real bugs from the Apache Tomcat project. We demonstrate the merits of performing unit-level symbolic execution and model checking to discover runtime exceptions and logical errors. The process is shown to have high coverage and able to uncover latent errors due to insufficient fixes
Automated Fixing of Programs with Contracts
This paper describes AutoFix, an automatic debugging technique that can fix
faults in general-purpose software. To provide high-quality fix suggestions and
to enable automation of the whole debugging process, AutoFix relies on the
presence of simple specification elements in the form of contracts (such as
pre- and postconditions). Using contracts enhances the precision of dynamic
analysis techniques for fault detection and localization, and for validating
fixes. The only required user input to the AutoFix supporting tool is then a
faulty program annotated with contracts; the tool produces a collection of
validated fixes for the fault ranked according to an estimate of their
suitability.
In an extensive experimental evaluation, we applied AutoFix to over 200
faults in four code bases of different maturity and quality (of implementation
and of contracts). AutoFix successfully fixed 42% of the faults, producing, in
the majority of cases, corrections of quality comparable to those competent
programmers would write; the used computational resources were modest, with an
average time per fix below 20 minutes on commodity hardware. These figures
compare favorably to the state of the art in automated program fixing, and
demonstrate that the AutoFix approach is successfully applicable to reduce the
debugging burden in real-world scenarios.Comment: Minor changes after proofreadin
Recommended from our members
Detecting, Isolating and Enforcing Dependencies Between and Within Test Cases
Testing stateful applications is challenging, as it can be difficult to identify hidden dependencies on program state. These dependencies may manifest between several test cases, or simply within a single test case. When it's left to developers to document, understand, and respond to these dependencies, a mistake can result in unexpected and invalid test results. Although current testing infrastructure does not currently leverage state dependency information, we argue that it could, and that by doing so testing can be improved. Our results thus far show that by recovering dependencies between test cases and modifying the popular testing framework, JUnit, to utilize this information, we can optimize the testing process, reducing time needed to run tests by 62% on average. Our ongoing work is to apply similar analyses to improve existing state of the art test suite prioritization techniques and state of the art test case generation techniques. This work is advised by Professor Gail Kaiser
Cause Clue Clauses: Error Localization using Maximum Satisfiability
Much effort is spent everyday by programmers in trying to reduce long,
failing execution traces to the cause of the error. We present a new algorithm
for error cause localization based on a reduction to the maximal satisfiability
problem (MAX-SAT), which asks what is the maximum number of clauses of a
Boolean formula that can be simultaneously satisfied by an assignment. At an
intuitive level, our algorithm takes as input a program and a failing test, and
comprises the following three steps. First, using symbolic execution, we encode
a trace of a program as a Boolean trace formula which is satisfiable iff the
trace is feasible. Second, for a failing program execution (e.g., one that
violates an assertion or a post-condition), we construct an unsatisfiable
formula by taking the trace formula and additionally asserting that the input
is the failing test and that the assertion condition does hold at the end.
Third, using MAX-SAT, we find a maximal set of clauses in this formula that can
be satisfied together, and output the complement set as a potential cause of
the error. We have implemented our algorithm in a tool called bug-assist for C
programs. We demonstrate the surprising effectiveness of the tool on a set of
benchmark examples with injected faults, and show that in most cases,
bug-assist can quickly and precisely isolate the exact few lines of code whose
change eliminates the error. We also demonstrate how our algorithm can be
modified to automatically suggest fixes for common classes of errors such as
off-by-one.Comment: The pre-alpha version of the tool can be downloaded from
http://bugassist.mpi-sws.or
Learning Tractable Probabilistic Models for Fault Localization
In recent years, several probabilistic techniques have been applied to
various debugging problems. However, most existing probabilistic debugging
systems use relatively simple statistical models, and fail to generalize across
multiple programs. In this work, we propose Tractable Fault Localization Models
(TFLMs) that can be learned from data, and probabilistically infer the location
of the bug. While most previous statistical debugging methods generalize over
many executions of a single program, TFLMs are trained on a corpus of
previously seen buggy programs, and learn to identify recurring patterns of
bugs. Widely-used fault localization techniques such as TARANTULA evaluate the
suspiciousness of each line in isolation; in contrast, a TFLM defines a joint
probability distribution over buggy indicator variables for each line. Joint
distributions with rich dependency structure are often computationally
intractable; TFLMs avoid this by exploiting recent developments in tractable
probabilistic models (specifically, Relational SPNs). Further, TFLMs can
incorporate additional sources of information, including coverage-based
features such as TARANTULA. We evaluate the fault localization performance of
TFLMs that include TARANTULA scores as features in the probabilistic model. Our
study shows that the learned TFLMs isolate bugs more effectively than previous
statistical methods or using TARANTULA directly.Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI
2015
Finding Regressions in Projects under Version Control Systems
Version Control Systems (VCS) are frequently used to support development of
large-scale software projects. A typical VCS repository of a large project can
contain various intertwined branches consisting of a large number of commits.
If some kind of unwanted behaviour (e.g. a bug in the code) is found in the
project, it is desirable to find the commit that introduced it. Such commit is
called a regression point. There are two main issues regarding the regression
points. First, detecting whether the project after a certain commit is correct
can be very expensive as it may include large-scale testing and/or some other
forms of verification. It is thus desirable to minimise the number of such
queries. Second, there can be several regression points preceding the actual
commit; perhaps a bug was introduced in a certain commit, inadvertently fixed
several commits later, and then reintroduced in a yet later commit. In order to
fix the actual commit it is usually desirable to find the latest regression
point.
The currently used distributed VCS contain methods for regression
identification, see e.g. the git bisect tool. In this paper, we present a new
regression identification algorithm that outperforms the current tools by
decreasing the number of validity queries. At the same time, our algorithm
tends to find the latest regression points which is a feature that is missing
in the state-of-the-art algorithms. The paper provides an experimental
evaluation of the proposed algorithm and compares it to the state-of-the-art
tool git bisect on a real data set
- …