Semi-automatic fault localization
One of the most expensive and time-consuming components of the debugging
process is locating the errors or faults. To locate faults, developers must identify
statements involved in failures and select suspicious statements that might contain
faults. In practice, this localization is done by developers in a tedious and manual
way, using only a single execution, targeting only one fault, and having a limited
perspective into a large search space.
The thesis of this research is that fault localization can be partially automated
with the use of commonly available dynamic information gathered from test-case
executions in a way that is effective, efficient, tolerant of test cases that pass but also
execute the fault, and scalable to large programs that potentially contain multiple
faults. The overall goal of this research is to develop effective and efficient fault
localization techniques that scale to programs of large size and with multiple faults.
There are three principal steps performed to reach this goal: (1) Develop practical
techniques for locating suspicious regions in a program; (2) Develop techniques to
partition test suites into smaller, specialized test suites to target specific faults; and
(3) Evaluate the usefulness and cost of these techniques.
In this dissertation, the difficulties and limitations of previous work in the area
of fault localization are explored. A technique, called Tarantula, is presented that
addresses these difficulties. Empirical evaluation of the Tarantula technique shows
that it is efficient and effective for many faults. The evaluation also demonstrates
that the Tarantula technique can lose effectiveness as the number of faults increases.
To address the loss of effectiveness for programs with multiple faults, supporting
techniques have been developed and are presented. The empirical evaluation of these
supporting techniques demonstrates that they can enable effective fault localization in
the presence of multiple faults. A new mode of debugging, called parallel debugging, is
developed and empirical evidence demonstrates that it can provide a savings in terms
of both total expense and time to delivery. A prototype visualization is provided to
display the fault-localization results as well as to provide a method to interact and
explore those results. Finally, a study on the effects of the composition of test suites
on fault localization is presented.
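The suspiciousness metric at the heart of Tarantula is compact enough to sketch. The following is a minimal illustration of the published formula (not the dissertation's implementation), assuming per-statement counts of the passing and failing tests that execute each statement:

```python
# Minimal sketch of the Tarantula suspiciousness formula (illustrative
# only, not the dissertation's implementation). Assumed inputs: per-
# statement counts of covering passing/failing tests, plus suite totals.

def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """Suspiciousness in [0, 1]; higher means more likely faulty."""
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    if fail_ratio + pass_ratio == 0:
        return 0.0  # statement never executed by any test
    return fail_ratio / (fail_ratio + pass_ratio)

# A statement covered by 3 of 4 failing tests but only 1 of 10 passing
# tests scores ~0.88, ranking it near the top of the report.
print(tarantula(3, 1, 4, 10))
```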
Applying Formal Methods to Networking: Theory, Techniques and Applications
Despite its great importance, modern network infrastructure is remarkable for
the lack of rigor in its engineering. The Internet, which began as a research
experiment, was never designed to handle the users and applications it hosts
today. The lack of formalization of the Internet architecture meant limited
abstractions and modularity, especially for the control and management planes,
thus requiring a new protocol built from scratch for every new need. This led
to an unwieldy, ossified Internet architecture resistant to any attempt at
formal verification, and an Internet culture where expediency and pragmatism
are favored over formal correctness. Fortunately, recent work in the space of
clean-slate Internet design---especially the software-defined networking (SDN)
paradigm---offers the Internet community another chance to develop the right
kind of architecture and abstractions. This has also led to a great resurgence
of interest in applying formal methods to the specification, verification, and
synthesis of networking protocols and applications. In this paper, we present a
self-contained tutorial on the formidable amount of work that has been done in
formal methods, and a survey of its applications to networking.
Approaches to Interpreter Composition
In this paper, we compose six different Python and Prolog VMs into four pairwise
compositions: one using C interpreters; one running on the JVM; one using
meta-tracing interpreters; and one using a C interpreter and a meta-tracing
interpreter. We show that programs that cross the language barrier frequently
execute faster in a meta-tracing composition, and that meta-tracing imposes a
significantly lower overhead on composed programs relative to mono-language
programs.
Automated synthesis and debugging of declarative models in Alloy
In theory, formal specifications offer numerous benefits in developing more reliable software. In practice, however, the use of specifications is rather limited, and practitioners often consider them more trouble than they are worth. Indeed, manually writing detailed specifications using notations that have unfamiliar syntax and semantics can be a daunting task -- even for experienced programmers. We introduce a new automated approach for synthesis of desired specifications and debugging of faulty specifications using given examples that capture the essence of desired properties and serve as test cases. Our focus is on specifications written in the declarative language Alloy -- a first-order logic based on relations with transitive closure, and its SAT-based analysis engine. Our key insight is that a test-driven foundation enables modern approaches to synthesis and debugging of imperative code to serve as a basis for developing novel analogous techniques for declarative specifications. For synthesis, we build on equivalence in relational algebra and introduce techniques for generating candidate Alloy expressions. We also introduce a technique to complete a partial Alloy model with holes using constraint solving. For locating faults in buggy specifications, we build on mutation-based fault localization and introduce techniques for locating likely faulty nodes in the abstract syntax tree of the faulty specification. Moreover, we integrate our expression generation and fault localization techniques to introduce a technique for automated specification repair. We experimentally evaluate our techniques using several Alloy models as subjects, including those with real faults. The results show that our techniques are effective at synthesis and debugging of the subjects. We believe our techniques provide an important step towards increasing the role of formal specifications in developing more reliable software and realizing their promise.
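The mutation-based fault localization this work builds on can be outlined generically. Below is a rough Python sketch of the idea, not the authors' Alloy tooling; the helpers `mutants_of` and `run_test` are hypothetical placeholders supplied by the caller:

```python
# Generic outline of mutation-based fault localization (illustrative
# sketch only). `mutants_of` and `run_test` are hypothetical
# placeholders, not part of the authors' tool.

def rank_suspicious_nodes(ast_nodes, failing_tests, mutants_of, run_test):
    """Score each AST node by how well mutating it repairs failing tests."""
    scores = {}
    for node in ast_nodes:
        best = 0
        for mutant in mutants_of(node):
            # Count originally-failing tests that pass on this mutant.
            fixed = sum(1 for t in failing_tests if run_test(mutant, t))
            best = max(best, fixed)
        scores[node] = best
    # Nodes whose mutants fix the most failing tests are most likely faulty.
    return sorted(scores, key=scores.get, reverse=True)
```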
Enhancing Usability and Explainability of Data Systems
The recent growth of data science expanded its reach to an ever-growing user base of nonexperts, increasing the need for usability, understandability, and explainability in these systems. Enhancing usability makes data systems accessible to people with different skills and backgrounds alike, leading to democratization of data systems. Furthermore, proper understanding of data and data-driven systems is necessary for the users to trust the function of the systems that learn from data. Finally, data systems should be transparent: when a data system behaves unexpectedly or malfunctions, the users deserve a proper explanation of what caused the observed incident. Unfortunately, most existing data systems offer limited usability and support for explanations: these systems are usable only by experts with sound technical skills, and even expert users are hindered by the lack of transparency into the systems' inner workings and functions. The aim of my thesis is to bridge the usability gap between nonexpert users and complex data systems, aid all sorts of users, including the expert ones, in data and system understanding, and provide explanations that help reason about unexpected outcomes involving data systems. Specifically, my thesis has the following three goals: (1) enhancing usability of data systems for nonexperts, (2) enabling data understanding that can assist users in a variety of tasks such as achieving trust in data-driven machine learning, gaining data understanding, and data cleaning, and (3) explaining causes of unexpected outcomes involving data and data systems.
For enhancing usability, we focus on example-driven user intent discovery. We develop systems based on example-driven interactions in two different settings: querying relational databases and personalized document summarization. Towards data understanding, we develop a new data-profiling primitive that can characterize tuples for which a machine-learned model is likely to produce untrustworthy predictions. We also develop an explanation framework to explain the causes of such untrustworthy predictions. Additionally, this new data-profiling primitive enables interactive data cleaning. Finally, we develop two explanation frameworks, tailored to provide explanations in debugging data system components, including the data itself. The explanation frameworks focus on explaining the root cause of a concurrent application's intermittent failure and exposing issues in the data that cause a data-driven system to malfunction.
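As a toy illustration of such a data-profiling primitive (a sketch of the general idea only, not the technique developed in the thesis): learn simple per-column bounds from the training data, then flag serving tuples that fall outside those bounds as points where the model's predictions deserve less trust.

```python
import numpy as np

# Toy sketch only: approximate a "training data profile" by per-column
# min/max bounds, and flag tuples outside the profile as points where a
# model trained on this data is more likely to be untrustworthy.
# (The thesis's actual data-profiling primitive is richer than this.)

def fit_profile(train):
    """train: (n, d) array of training tuples."""
    return train.min(axis=0), train.max(axis=0)

def untrustworthy(profile, tuples):
    """True for tuples that violate the learned per-column bounds."""
    lo, hi = profile
    return np.any((tuples < lo) | (tuples > hi), axis=1)

train = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 11.0]])
serve = np.array([[2.5, 11.0], [9.0, 50.0]])
print(untrustworthy(fit_profile(train), serve))  # [False  True]
```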
Combining Fault Localization with Information Retrieval: An Analysis of Accuracy and Performance for Bug Finding
Debugging is a key activity in the software development process. It has been used extensively by developers to attempt to localize faults, while enhancing the quality and performance of software in general. There has been a significant amount of research on developing and enhancing fault localization techniques, which assist developers in locating faults within a body of code. However, identifying fault locations using individual techniques is not always effective; combining different techniques, which represent distinct forms of analysis, might help to overcome this issue. There has been very limited research suggesting that combining more than one approach to fault localization may have benefits, principally because information from different sources is included in the localization process. In this thesis, I attempt to more precisely address the question of whether combining different fault localization techniques can more effectively and efficiently find faults in code, when contrasted with a single technique. To answer this, I have carried out experiments that combine the use of three fault localization techniques: Information Retrieval (IR), Spectrum-Based Fault Localization (SBFL), and Text-Based Search. These techniques are representative of both dynamic and static fault localization. My hypothesis is that a combination of dynamic and static fault localization analysis can assist developers in better fault localization. I have evaluated the various combinations of techniques in identifying faults against real-world programs from the Defects4J benchmark, analyzing 395 faults and their bug reports. The experimental results demonstrate that the combination of three techniques (SBFL, Text Search, and IR) is the most accurate, locating 343 of the 395 faults (86.84% accuracy). This finding contributes positively towards concretely recommending techniques for assisting developers in locating faults in code. Guidelines are provided on which combination of techniques yields maximal accuracy, especially when there is no prior knowledge about the fault.
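One plausible way to combine such techniques is sketched below; the per-technique normalization, the equal weighting, and the example locations are all illustrative assumptions, not the thesis's actual combination strategy:

```python
# Illustrative sketch of combining fault-localization rankings (the
# thesis combines SBFL, IR, and text search; the normalization, the
# equal weights, and the locations here are made-up assumptions).

def combine_scores(per_technique_scores, weights=None):
    """per_technique_scores: list of {location: score} dicts, one per technique."""
    weights = weights or [1.0] * len(per_technique_scores)
    combined = {}
    for w, scores in zip(weights, per_technique_scores):
        top = max(scores.values())  # normalize so techniques are comparable
        for loc, s in scores.items():
            combined[loc] = combined.get(loc, 0.0) + w * (s / top if top else 0.0)
    return sorted(combined, key=combined.get, reverse=True)

sbfl = {"Foo.java:42": 0.9, "Bar.java:7": 0.4}   # spectrum-based scores
ir   = {"Foo.java:42": 3.1, "Baz.java:3": 2.2}   # retrieval similarity scores
print(combine_scores([sbfl, ir]))  # Foo.java:42 ranks first
```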
Automating Program Verification and Repair Using Invariant Analysis and Test Input Generation
Software bugs are a persistent feature of daily life---crashing web browsers, allowing cyberattacks, and distorting the results of scientific computations. One approach to improving software uses program invariants---mathematical descriptions of program behaviors---to verify code and detect bugs. Current invariant generation techniques lack support for complex yet important forms of invariants, such as general polynomial relations and properties of arrays. As a result, we lack the ability to conduct precise analysis of programs that use these common constructs. This dissertation presents DIG, a static and dynamic analysis framework for discovering several useful classes of program invariants, including (i) nonlinear polynomial relations, which are fundamental to many scientific applications; (ii) disjunctive invariants, which express branching behaviors in programs; and (iii) properties of multidimensional arrays, which appear in many practical applications. We describe theoretical and empirical results showing that DIG can efficiently and accurately find many important invariants in real-world uses, e.g., polynomial properties in numerical algorithms and array relations in a full AES encryption implementation. Automatic program verification and synthesis are long-standing problems in computer science. However, while there has been a great deal of work on program verification, program synthesis has received comparatively little. Consequently, important synthesis tasks, e.g., generating program repairs, remain difficult and time-consuming. This dissertation proves that certain formulations of verification and synthesis are equivalent, allowing for direct applications of techniques and tools between these two research areas. Based on these ideas, we develop CETI, a tool that leverages existing verification techniques and tools for automatic program repair. Experimental results show that CETI can achieve higher success rates than many other standard program repair methods.
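The dynamic side of such invariant inference can be shown in miniature. The sketch below is only in the spirit of DIG, not its actual algorithm: it builds degree-2 monomial terms over traced variable values and reads candidate polynomial equalities off the nullspace of the resulting trace matrix.

```python
import numpy as np

# Miniature dynamic equality-invariant inference (a sketch in the
# spirit of DIG, not its algorithm): any coefficient vector c in the
# nullspace of the term matrix gives a relation c . terms = 0 that
# every observed trace satisfies.

def equality_invariants(traces):
    """traces: (n_samples, n_vars) array of values observed at a program point."""
    x = np.asarray(traces, dtype=float)
    n, d = x.shape
    # Terms: 1, each variable, and all degree-2 products.
    cols = [np.ones(n)] + [x[:, i] for i in range(d)]
    cols += [x[:, i] * x[:, j] for i in range(d) for j in range(i, d)]
    A = np.column_stack(cols)
    _, s, vt = np.linalg.svd(A)
    rank = int((s > s.max() * 1e-9).sum())
    return vt[rank:]  # each row: coefficients of a relation all traces satisfy

# Traces from a point where y == x*x holds; term order is [1, x, y, x^2, x*y, y^2].
xs = np.array([[1, 1], [2, 4], [3, 9], [4, 16], [5, 25]])
print(equality_invariants(xs).round(3))  # one vector encoding y - x^2 = 0
```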
Doctor of Philosophy dissertation
A modern software system is a composition of parts that are themselves highly complex: operating systems, middleware, libraries, servers, and so on. In principle, compositionality of interfaces means that we can understand any given module independently of the internal workings of other parts. In practice, however, abstractions are leaky, and with every generation, modern software systems grow in complexity. Traditional ways of understanding failures, explaining anomalous executions, and analyzing performance are reaching their limits in the face of emergent behavior, unrepeatability, cross-component execution, software aging, and adversarial changes to the system at run time. Deterministic systems analysis has the potential to change the way we analyze and debug software systems. Recorded once, the execution of the system becomes an independent artifact, which can be analyzed offline. The availability of the complete system state, the guaranteed behavior of re-execution, and the absence of limitations on the run-time complexity of analysis collectively enable the deep, iterative, and automatic exploration of the dynamic properties of the system. This work creates a foundation for making deterministic replay a ubiquitous system analysis tool. It defines design and engineering principles for building fast and practical replay machines capable of capturing the complete execution of an entire operating system with an overhead of a few percent on a realistic workload and with minimal installation costs. To enable an intuitive interface for constructing replay analysis tools, this work implements a powerful virtual machine introspection layer that allows an analysis algorithm to be programmed against the state of the recorded system in the familiar terms of source-level variable and type names. To support performance analysis, the replay engine provides a faithful performance model of the original execution during replay.
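The record/replay principle behind this work can be shown in miniature. The toy below, nothing like the dissertation's whole-OS engine, logs the results of nondeterministic calls once and then replays the log, so re-execution is exactly repeatable and can be analyzed offline:

```python
import random
import time

# Miniature record/replay sketch: capture every nondeterministic input
# once, then feed the log back so re-execution is bit-for-bit
# repeatable. (Illustrative only; the dissertation records an entire OS.)

class Recorder:
    def __init__(self):
        self.log = []
    def call(self, fn, *args):
        result = fn(*args)        # perform the real nondeterministic call
        self.log.append(result)   # persist its result for replay
        return result

class Replayer:
    def __init__(self, log):
        self._log = iter(log)
    def call(self, fn, *args):
        return next(self._log)    # return the recorded result instead

def workload(env):
    # Any code whose nondeterminism is routed through env.call().
    return env.call(random.random) + env.call(time.time)

rec = Recorder()
original = workload(rec)
replayed = workload(Replayer(rec.log))
assert original == replayed  # replay reproduces the execution exactly
```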