    Semi-automatic fault localization

    One of the most expensive and time-consuming components of the debugging process is locating the errors or faults. To locate faults, developers must identify statements involved in failures and select suspicious statements that might contain faults. In practice, this localization is done by developers in a tedious and manual way, using only a single execution, targeting only one fault, and having a limited perspective into a large search space. The thesis of this research is that fault localization can be partially automated with the use of commonly available dynamic information gathered from test-case executions, in a way that is effective, efficient, tolerant of test cases that pass but also execute the fault, and scalable to large programs that potentially contain multiple faults. The overall goal of this research is to develop effective and efficient fault-localization techniques that scale to programs of large size and with multiple faults. There are three principal steps performed to reach this goal: (1) develop practical techniques for locating suspicious regions in a program; (2) develop techniques to partition test suites into smaller, specialized test suites that target specific faults; and (3) evaluate the usefulness and cost of these techniques. In this dissertation, the difficulties and limitations of previous work in the area of fault localization are explored. A technique, called Tarantula, is presented that addresses these difficulties. Empirical evaluation of the Tarantula technique shows that it is efficient and effective for many faults. The evaluation also demonstrates that the Tarantula technique can lose effectiveness as the number of faults increases. To address this loss of effectiveness for programs with multiple faults, supporting techniques have been developed and are presented. The empirical evaluation of these supporting techniques demonstrates that they can enable effective fault localization in the presence of multiple faults. A new mode of debugging, called parallel debugging, is developed, and empirical evidence demonstrates that it can provide savings in terms of both total expense and time to delivery. A prototype visualization is provided to display the fault-localization results as well as to provide a method to interact with and explore those results. Finally, a study on the effects of the composition of test suites on fault localization is presented.

    Ph.D. Committee Chair: Harrold, Mary Jean; Committee Members: Orso, Alessandro; Pande, Santosh; Reiss, Steven; Rugaber, Spencer
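
    The Tarantula technique named above scores each statement by comparing the fraction of failing tests that execute it against the fraction of passing tests that do, so statements executed mostly by failing runs rank as most suspicious. A minimal sketch of the metric in Python, with a hypothetical coverage matrix and test outcomes as input:

        # Minimal sketch of the Tarantula suspiciousness metric. The coverage
        # matrix and test outcomes are hypothetical; real tools collect them by
        # instrumenting test-case executions.

        def tarantula(coverage, outcomes):
            """coverage: statement id -> set of test ids executing it.
            outcomes: test id -> True if that test passed."""
            total_passed = sum(1 for ok in outcomes.values() if ok)
            total_failed = len(outcomes) - total_passed
            scores = {}
            for stmt, tests in coverage.items():
                passed = sum(1 for t in tests if outcomes[t])
                failed = len(tests) - passed
                pass_ratio = passed / total_passed if total_passed else 0.0
                fail_ratio = failed / total_failed if total_failed else 0.0
                denom = pass_ratio + fail_ratio
                scores[stmt] = fail_ratio / denom if denom else 0.0
            # Most suspicious statements first.
            return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

        coverage = {"s1": {"t1", "t2"}, "s2": {"t2", "t3"}, "s3": {"t3"}}
        outcomes = {"t1": True, "t2": True, "t3": False}  # t3 is the failing test
        print(tarantula(coverage, outcomes))  # s3, executed only by the failing test, ranks first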

    Applying Formal Methods to Networking: Theory, Techniques and Applications

    Despite its great importance, modern network infrastructure is remarkable for the lack of rigor in its engineering. The Internet, which began as a research experiment, was never designed to handle the users and applications it hosts today. The lack of formalization of the Internet architecture meant limited abstractions and modularity, especially for the control and management planes, so that every new need required a new protocol built from scratch. This led to an unwieldy, ossified Internet architecture resistant to any attempts at formal verification, and an Internet culture where expediency and pragmatism are favored over formal correctness. Fortunately, recent work in the space of clean-slate Internet design, especially the software-defined networking (SDN) paradigm, offers the Internet community another chance to develop the right kind of architecture and abstractions. This has also led to a great resurgence of interest in applying formal methods to the specification, verification, and synthesis of networking protocols and applications. In this paper, we present a self-contained tutorial of the formidable amount of work that has been done in formal methods, and present a survey of its applications to networking.

    Comment: 30 pages, submitted to IEEE Communications Surveys and Tutorials
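
    The survey's subject is easiest to see in miniature on the data plane, where verification means exhaustively checking a property of the forwarding state rather than testing a few packets. A toy sketch under invented assumptions (a static per-destination forwarding table; real tools reason about symbolic packet classes):

        # Toy data-plane check: verify that every node reaches the destination
        # and that no forwarding loop exists. The table is invented.

        def verify_forwarding(next_hop, dest):
            """next_hop: node -> node it forwards packets for dest to."""
            for start in next_hop:
                seen, node = set(), start
                while node != dest:
                    if node in seen:
                        return f"loop at {node}, reached from {start}"
                    seen.add(node)
                    if node not in next_hop:
                        return f"black hole at {node}, reached from {start}"
                    node = next_hop[node]
            return "verified: all nodes reach dest, no loops"

        table = {"a": "b", "b": "c", "c": "d"}  # d is the destination
        print(verify_forwarding(table, "d"))    # verified
        table["c"] = "a"                        # introduce a loop
        print(verify_forwarding(table, "d"))    # reports the loop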

    Approaches to Interpreter Composition

    In this paper, we compose six different Python and Prolog VMs into four pairwise compositions: one using C interpreters; one running on the JVM; one using meta-tracing interpreters; and one using a C interpreter and a meta-tracing interpreter. We show that programs that cross the language barrier frequently execute faster in a meta-tracing composition, and that meta-tracing imposes a significantly lower overhead on composed programs relative to mono-language programs.

    Comment: 33 pages, 1 figure, 9 tables
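
    As a rough illustration of what such a composition looks like, the toy sketch below has one interpreter call into a second, stack-based one through an explicit bridge; crossing that bridge is the "language barrier" the abstract refers to. Both mini-interpreters are invented for illustration; the paper composes full Python and Prolog VMs.

        # Interpreter A: arithmetic over nested tuples, with a "foreign" form
        # that hands a value across the barrier to interpreter B.

        def eval_expr(expr, env):
            if isinstance(expr, (int, float)):
                return expr
            if isinstance(expr, str):
                return env[expr]
            op, lhs, rhs = expr
            if op == "foreign":                  # cross the language barrier
                return eval_foreign(lhs, eval_expr(rhs, env))
            a, b = eval_expr(lhs, env), eval_expr(rhs, env)
            return a + b if op == "+" else a * b

        # Interpreter B: a minimal stack machine.
        def eval_foreign(program, arg):
            stack = [arg]
            for instr in program:
                if instr == "dup":
                    stack.append(stack[-1])
                elif instr == "mul":
                    stack.append(stack.pop() * stack.pop())
            return stack.pop()

        # Square x inside interpreter B, then add 1 back in interpreter A.
        print(eval_expr(("+", 1, ("foreign", ["dup", "mul"], "x")), {"x": 7}))  # 50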

    Combining Fault Localization with Information Retrieval: an Analysis of Accuracy and Performance for Bug Finding

    Debugging is a key activity in the software development process. It is used extensively by developers attempting to localize faults, and it enhances the quality and performance of software in general. There has been a significant amount of work on developing and enhancing fault-localization techniques, which assist developers in locating faults within a body of code. However, identifying fault locations using an individual technique is not always effective; combining different techniques, which represent distinct forms of analysis, might help to overcome this issue. Only a very limited amount of research suggests that combining more than one approach to fault localization may have benefits, principally because information from different sources is included in the localization process. In this thesis, I address more precisely the question of whether combining different fault-localization techniques can find faults in code more effectively and efficiently than a single technique alone. To answer this, I have carried out experiments that combine the use of three fault-localization techniques: Information Retrieval (IR), Spectrum-Based Fault Localization (SBFL), and Text-Based Search. These techniques are representative of both dynamic and static fault localization. My hypothesis is that a combination of dynamic and static fault-localization analyses can assist developers in better fault localization. I have evaluated the various combinations of techniques on faults from real-world programs in the Defects4J benchmark, analyzing 395 faults and their bug reports. The experimental results demonstrate that the combination of all three techniques (SBFL, Text Search, and IR) is the most accurate, with 86.84% accuracy, locating 343 faults out of a total of 395. This finding contributes positively towards concretely recommending techniques for assisting developers in locating faults in code. Guidelines are provided on which combination of techniques yields maximal accuracy, especially when there is no prior knowledge about the fault.
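
    One straightforward way to combine such signals is to normalize each technique's per-file scores into a common range and sum them. The thesis does not prescribe this exact scheme, so the normalization, weights, and scores below are illustrative assumptions:

        # Sketch of score-level combination of three fault-localization
        # signals. All inputs are hypothetical.

        def normalize(scores):
            """Scale a {file: score} map into [0, 1]."""
            hi = max(scores.values()) or 1.0
            return {f: s / hi for f, s in scores.items()}

        def combine(sbfl, ir, text, weights=(1.0, 1.0, 1.0)):
            parts = [normalize(s) for s in (sbfl, ir, text)]
            files = set().union(*(p.keys() for p in parts))
            ranked = {f: sum(w * p.get(f, 0.0) for w, p in zip(weights, parts))
                      for f in files}
            return sorted(ranked.items(), key=lambda kv: kv[1], reverse=True)

        sbfl = {"Parser.java": 0.9, "Lexer.java": 0.4}   # dynamic (spectrum) signal
        ir   = {"Parser.java": 0.2, "Config.java": 0.8}  # bug-report similarity
        text = {"Parser.java": 0.5}                      # keyword matches
        print(combine(sbfl, ir, text))  # Parser.java, supported by all three, ranks first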

    Automating Program Verification and Repair Using Invariant Analysis and Test Input Generation

    Software bugs are a persistent feature of daily life: they crash web browsers, allow cyberattacks, and distort the results of scientific computations. One approach to improving software uses program invariants, mathematical descriptions of program behaviors, to verify code and detect bugs. Current invariant-generation techniques lack support for complex yet important forms of invariants, such as general polynomial relations and properties of arrays. As a result, we lack the ability to conduct precise analysis of programs that use this common data structure. This dissertation presents DIG, a static and dynamic analysis framework for discovering several useful classes of program invariants, including (i) nonlinear polynomial relations, which are fundamental to many scientific applications; (ii) disjunctive invariants, which express branching behaviors in programs; and (iii) properties of multidimensional arrays, which appear in many practical applications. We describe theoretical and empirical results showing that DIG can efficiently and accurately find many important invariants in real-world uses, e.g., polynomial properties in numerical algorithms and array relations in a full AES encryption implementation. Automatic program verification and synthesis are long-standing problems in computer science. However, while verification has received a great deal of attention, synthesis has received far less. Consequently, important synthesis tasks, e.g., generating program repairs, remain difficult and time-consuming. This dissertation proves that certain formulations of verification and synthesis are equivalent, allowing techniques and tools to be applied directly between these two research areas. Based on these ideas, we develop CETI, a tool that leverages existing verification techniques and tools for automatic program repair. Experimental results show that CETI can achieve higher success rates than many other standard program-repair methods.
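
    The flavor of polynomial-invariant discovery can be sketched dynamically: treat candidate monomials as the columns of a linear system built from observed traces, and read candidate invariants out of its null space. The traces, variables, and monomial set below are invented for illustration and are far simpler than what DIG handles:

        # Sketch of dynamic polynomial-invariant inference: find coefficient
        # vectors c with A @ c == 0 across all observed traces.
        import numpy as np

        def infer_invariant(traces):
            """traces: list of (x, y, z) observations from program runs."""
            terms = ["1", "x", "y", "x*y", "z"]
            rows = [[1, x, y, x * y, z] for (x, y, z) in traces]
            _, s, vt = np.linalg.svd(np.array(rows, dtype=float))
            null = [vt[i] for i in range(len(terms)) if i >= len(s) or s[i] < 1e-9]
            for c in null:
                eq = " ".join(f"{k:+.2g}*{t}" for k, t in zip(c, terms) if abs(k) > 1e-9)
                print(f"candidate invariant: {eq} == 0")

        # Traces from a hypothetical program in which z = x * y holds.
        infer_invariant([(1, 2, 2), (3, 4, 12), (5, 6, 30), (2, 7, 14), (0, 9, 0)])
        # prints a candidate proportional to x*y - z == 0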

    Doctor of Philosophy

    A modern software system is a composition of parts that are themselves highly complex: operating systems, middleware, libraries, servers, and so on. In principle, compositionality of interfaces means that we can understand any given module independently of the internal workings of other parts. In practice, however, abstractions are leaky, and with every generation, modern software systems grow in complexity. Traditional ways of understanding failures, explaining anomalous executions, and analyzing performance are reaching their limits in the face of emergent behavior, unrepeatability, cross-component execution, software aging, and adversarial changes to the system at run time. Deterministic systems analysis has the potential to change the way we analyze and debug software systems. Recorded once, the execution of the system becomes an independent artifact that can be analyzed offline. The availability of the complete system state, the guaranteed behavior of re-execution, and the absence of limitations on the run-time complexity of analysis collectively enable deep, iterative, and automatic exploration of the dynamic properties of the system. This work creates a foundation for making deterministic replay a ubiquitous system-analysis tool. It defines design and engineering principles for building fast and practical replay machines capable of capturing the complete execution of an entire operating system with an overhead of a few percent, on a realistic workload, and with minimal installation costs. To provide an intuitive interface for constructing replay-analysis tools, this work implements a powerful virtual machine introspection layer that allows an analysis algorithm to be programmed against the state of the recorded system in the familiar terms of source-level variable and type names. To support performance analysis, the replay engine provides a faithful performance model of the original execution during replay.
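
    The record/replay idea underlying this work can be sketched in a few lines: log every nondeterministic input during the live run, then re-execute the same computation feeding the log back in. A real replay machine records at the virtual machine and OS boundary; the toy below records a single input source:

        # Sketch of deterministic record/replay for one source of
        # nondeterminism. All names are invented.
        import random

        class Recorder:
            def __init__(self):
                self.log = []
            def read_input(self):
                v = random.randint(0, 100)  # the nondeterministic event...
                self.log.append(v)          # ...captured for later replay
                return v

        class Replayer:
            def __init__(self, log):
                self.events = iter(log)
            def read_input(self):
                return next(self.events)    # deterministic re-execution

        def workload(source):
            return sum(source.read_input() for _ in range(5))

        rec = Recorder()
        live = workload(rec)                    # original, nondeterministic run
        replayed = workload(Replayer(rec.log))  # offline re-execution
        assert live == replayed                 # replay reproduces the result
        print(live, rec.log)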