    Unit-Level Isolation and Testing of Buggy Code

    In real-world software development, maintenance plays a major role, and developers spend 50-80% of their time on maintenance-related activities. During software maintenance, a significant amount of effort is spent on finding and fixing bugs. In some cases, the fix does not completely eliminate the buggy behavior; though it addresses the reported problem, it fails to account for conditions that could lead to similar failures. There are many possible reasons: the conditions may have been overlooked or difficult to reproduce, e.g., when the components that invoke the code, or the underlying components it interacts with, cannot put it in a state where latent errors appear. We posit that such latent errors can be discovered sooner if the buggy section can be tested more thoroughly in a separate environment, a strategy loosely analogous to the medical procedure of a biopsy, where tissue is removed, examined, and subjected to a battery of tests to determine the presence of a disease. In this thesis, we propose a process in which the buggy code is extracted and isolated in a test framework. Test drivers and stubs are added to exercise the code and observe its interactions with its dependencies. We lay the groundwork for the creation of an automated tool for isolating code by studying its feasibility and investigating existing testing technologies that can facilitate the creation of such drivers and stubs. We investigate mocking frameworks, symbolic execution, and model checking tools, and test their capabilities by examining real bugs from the Apache Tomcat project. We demonstrate the merits of performing unit-level symbolic execution and model checking to discover runtime exceptions and logical errors. The process is shown to achieve high coverage and to uncover latent errors due to insufficient fixes.
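
    The stubbing idea at the heart of this isolation process can be sketched with a mocking framework. The sketch below uses Python's unittest.mock rather than the Java frameworks the thesis surveys, and release_connection and its pool dependency are hypothetical names invented purely for illustration: the stub drives the extracted unit into a rarely reached state (a lookup miss) that exposes a latent error.

```python
from unittest.mock import Mock

def release_connection(pool, conn_id):
    """Unit under test, extracted from its original module."""
    conn = pool.lookup(conn_id)        # interaction with a dependency
    if conn.in_use:                    # latent error: lookup may return None
        conn.close()
        pool.remove(conn_id)

def test_release_of_unknown_connection():
    # Stub the dependency so the unit can be driven into states the
    # surrounding system rarely reaches, e.g. a lookup miss.
    pool = Mock()
    pool.lookup.return_value = None
    try:
        release_connection(pool, 42)   # driver call exercising the unit
    except AttributeError as e:
        print("latent error exposed:", e)

test_release_of_unknown_connection()
```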

    Automated Fixing of Programs with Contracts

    This paper describes AutoFix, an automatic debugging technique that can fix faults in general-purpose software. To provide high-quality fix suggestions and to enable automation of the whole debugging process, AutoFix relies on the presence of simple specification elements in the form of contracts (such as pre- and postconditions). Using contracts enhances the precision of dynamic analysis techniques for fault detection and localization, and for validating fixes. The only required user input to the AutoFix supporting tool is then a faulty program annotated with contracts; the tool produces a collection of validated fixes for the fault, ranked according to an estimate of their suitability. In an extensive experimental evaluation, we applied AutoFix to over 200 faults in four code bases of different maturity and quality (of implementation and of contracts). AutoFix successfully fixed 42% of the faults, producing, in the majority of cases, corrections of quality comparable to those competent programmers would write; the computational resources used were modest, with an average time per fix below 20 minutes on commodity hardware. These figures compare favorably to the state of the art in automated program fixing, and demonstrate that the AutoFix approach is successfully applicable to reduce the debugging burden in real-world scenarios.
    Comment: Minor changes after proofreading
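
    To make the role of contracts concrete, the following sketch mimics pre- and postconditions with plain Python assertions; AutoFix itself operates on Eiffel contracts, and move_item and its conditions are invented here only for illustration. The contracts act as the oracle: a candidate fix is validated by checking that runs which previously violated a contract now satisfy it, while passing runs still pass.

```python
def move_item(stack, target):
    # precondition: the source stack must not be empty
    assert len(stack) > 0, "precondition violated"
    old_total = len(stack) + len(target)
    target.append(stack.pop())
    # postcondition: no items are created or lost
    assert len(stack) + len(target) == old_total, "postcondition violated"

src, dst = [1, 2, 3], []
move_item(src, dst)
print(src, dst)   # [1, 2] [3]
```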

    Cause Clue Clauses: Error Localization using Maximum Satisfiability

    Much effort is spent every day by programmers trying to reduce long, failing execution traces to the cause of the error. We present a new algorithm for error cause localization based on a reduction to the maximal satisfiability problem (MAX-SAT), which asks for the maximum number of clauses of a Boolean formula that can be simultaneously satisfied by an assignment. At an intuitive level, our algorithm takes as input a program and a failing test, and comprises the following three steps. First, using symbolic execution, we encode a trace of a program as a Boolean trace formula which is satisfiable iff the trace is feasible. Second, for a failing program execution (e.g., one that violates an assertion or a post-condition), we construct an unsatisfiable formula by taking the trace formula and additionally asserting that the input is the failing test and that the assertion condition does hold at the end. Third, using MAX-SAT, we find a maximal set of clauses in this formula that can be satisfied together, and output the complement set as a potential cause of the error. We have implemented our algorithm in a tool called bug-assist for C programs. We demonstrate the surprising effectiveness of the tool on a set of benchmark examples with injected faults, and show that in most cases, bug-assist can quickly and precisely isolate the exact few lines of code whose change eliminates the error. We also demonstrate how our algorithm can be modified to automatically suggest fixes for common classes of errors such as off-by-one.
    Comment: The pre-alpha version of the tool can be downloaded from http://bugassist.mpi-sws.or
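
    A toy rendering of the reduction, assuming a single straight-line trace: statements become soft clauses, while the failing input and the asserted postcondition become hard clauses. The sketch below brute-forces the maximal satisfiable subset over a tiny integer domain; bug-assist itself encodes real C traces and calls an actual MAX-SAT solver.

```python
from itertools import combinations, product

hard = [lambda x, y: x == 1,   # the failing input
        lambda x, y: y == 3]   # the assertion, required to hold
soft = {"line 2: y = x + 1": lambda x, y: y == x + 1}  # trace statement

def satisfiable(clauses):
    # Brute force over a tiny domain; a real tool uses a MAX-SAT solver.
    return any(all(c(x, y) for c in clauses)
               for x, y in product(range(5), repeat=2))

def localize():
    # Largest set of soft clauses consistent with the hard ones; the
    # complement is reported as the potential cause of the error.
    names = list(soft)
    for k in range(len(names), -1, -1):
        for keep in combinations(names, k):
            if satisfiable(hard + [soft[n] for n in keep]):
                return set(names) - set(keep)

print("suspicious statements:", localize())  # {'line 2: y = x + 1'}
```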

    Learning Tractable Probabilistic Models for Fault Localization

    In recent years, several probabilistic techniques have been applied to various debugging problems. However, most existing probabilistic debugging systems use relatively simple statistical models, and fail to generalize across multiple programs. In this work, we propose Tractable Fault Localization Models (TFLMs) that can be learned from data, and probabilistically infer the location of the bug. While most previous statistical debugging methods generalize over many executions of a single program, TFLMs are trained on a corpus of previously seen buggy programs, and learn to identify recurring patterns of bugs. Widely-used fault localization techniques such as TARANTULA evaluate the suspiciousness of each line in isolation; in contrast, a TFLM defines a joint probability distribution over buggy indicator variables for each line. Joint distributions with rich dependency structure are often computationally intractable; TFLMs avoid this by exploiting recent developments in tractable probabilistic models (specifically, Relational SPNs). Further, TFLMs can incorporate additional sources of information, including coverage-based features such as TARANTULA. We evaluate the fault localization performance of TFLMs that include TARANTULA scores as features in the probabilistic model. Our study shows that the learned TFLMs isolate bugs more effectively than previous statistical methods or using TARANTULA directly.
    Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI 2015)
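
    Since TARANTULA scores feed into TFLMs as coverage-based features, it helps to see how they are computed. The sketch below implements the standard TARANTULA suspiciousness formula over a made-up coverage matrix; the line numbers and test outcomes are hypothetical.

```python
def tarantula(coverage, outcomes):
    # Standard TARANTULA score: lines executed mostly by failing tests
    # are the most suspicious.
    total_fail = sum(1 for o in outcomes if not o) or 1
    total_pass = sum(1 for o in outcomes if o) or 1
    scores = {}
    for line, hits in coverage.items():
        f = sum(h and not o for h, o in zip(hits, outcomes)) / total_fail
        p = sum(h and o for h, o in zip(hits, outcomes)) / total_pass
        scores[line] = f / (f + p) if f + p else 0.0
    return scores

# Three tests; the third fails. Line 12 is covered only by the failing
# test and receives the highest suspiciousness.
coverage = {10: [1, 1, 1], 11: [1, 0, 1], 12: [0, 0, 1]}
outcomes = [True, True, False]          # True = test passed
print(tarantula(coverage, outcomes))    # {10: 0.5, 11: 0.67, 12: 1.0}
```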

    Finding Regressions in Projects under Version Control Systems

    Version Control Systems (VCS) are frequently used to support the development of large-scale software projects. A typical VCS repository of a large project can contain various intertwined branches consisting of a large number of commits. If some kind of unwanted behaviour (e.g. a bug in the code) is found in the project, it is desirable to find the commit that introduced it. Such a commit is called a regression point. There are two main issues regarding regression points. First, detecting whether the project is correct after a certain commit can be very expensive, as it may involve large-scale testing and/or other forms of verification. It is thus desirable to minimise the number of such queries. Second, there can be several regression points preceding the actual commit; perhaps a bug was introduced in a certain commit, inadvertently fixed several commits later, and then reintroduced in a yet later commit. In order to fix the actual commit, it is usually desirable to find the latest regression point. Currently used distributed VCSs contain methods for regression identification, see e.g. the git bisect tool. In this paper, we present a new regression identification algorithm that outperforms the current tools by decreasing the number of validity queries. At the same time, our algorithm tends to find the latest regression points, a feature that is missing in the state-of-the-art algorithms. The paper provides an experimental evaluation of the proposed algorithm and compares it to the state-of-the-art tool git bisect on a real data set.
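
    For contrast with the paper's algorithm, here is the baseline bisection idea on a linear history, which is essentially what git bisect performs when goodness flips exactly once; is_good stands in for the expensive validity query, and the commit ids and regression point are made up. The paper's algorithm goes beyond this by handling branched histories and multiple regression points while still minimising queries.

```python
def find_regression(commits, is_good):
    """First bad commit, assuming commits[0] is good, commits[-1] is
    bad, and goodness flips exactly once along the history."""
    lo, hi = 0, len(commits) - 1       # invariant: lo good, hi bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):      # one validity query per step
            lo = mid
        else:
            hi = mid
    return commits[hi]

history = list(range(16))              # hypothetical commit ids 0..15
bad_since = 11                         # hypothetical regression point
print("regression introduced at commit",
      find_regression(history, lambda c: c < bad_since))   # 11
```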