8 research outputs found

    Evaluating Lexical Approximation of Program Dependence

    Get PDF
    Complex dependence analysis typically provides an underpinning approximation of true program dependence. We investigate the effectiveness of using lexical information to approximate such dependence, introducing two new deletion operators to Observation-Based Slicing (ORBS). ORBS provides direct observation of program dependence, computing a slice using iterative, speculative deletion of program parts. Deletions become permanent if they do not affect the slicing criterion. The original ORBS uses a bounded deletion window operator that attempts to delete consecutive lines together. Our new deletion operators attempt to delete multiple, non-contiguous lines that are lexically similar to each other. We evaluate the lexical dependence approximation by exploring the trade-off between the precision and the speed of dependence analysis performed with new deletion operators. The deletion operators are evaluated independently, as well as collectively via a novel generalization of ORBS that exploits multiple deletion operators: Multi-operator Observation-Based Slicing (MOBS). An empirical evaluation using three Java projects, six C projects, and one multi-lingual project written in Python and C finds that the lexical information provides a useful approximation to the underlying dependence. On average, MOBS can delete 69% of lines deleted by the original ORBS, while taking only 36% of the wall clock time required by ORBS

    Delta debugging microservice systems with parallel optimization

    Get PDF

    A comparison of tree- and line-oriented observational slicing

    Get PDF
    Observation-based slicing and its generalization observational slicing are recently-introduced, language-independent dynamic slicing techniques. They both construct slices based on the dependencies observed during program execution, rather than static or dynamic dependence analysis. The original implementation of the observation-based slicing algorithm used lines of source code as its program representation. A recent variation, developed to slice modelling languages (such as Simulink), used an XML representation of an executable model. We ported the XML slicer to source code by constructing a tree representation of traditional source code through the use of srcML. This work compares the tree- and line-based slicers using four experiments involving twenty different programs, ranging from classic benchmarks to million-line production systems. The resulting slices are essentially the same size for the majority of the programs and are often identical. However, structural constraints imposed by the tree representation sometimes force the slicer to retain enclosing control structures. It can also “bog down” trying to delete single-token subtrees. This occasionally makes the tree-based slices larger and the tree-based slicer slower than a parallelised version of the line-based slicer. In addition, a Java versus C comparison finds that the two languages lead to similar slices, but Java code takes noticeably longer to slice. The initial experiments suggest two improvements to the tree-based slicer: the addition of a size threshold, for ignoring small subtrees, and subtree replacement. The former enables the slicer to run 3.4 times faster while producing slices that are only about 9% larger. At the same time the subtree replacement reduces size by about 8–12% and allows the tree-based slicer to produce more natural slices

    Using contextual knowledge in interactive fault localization

    Get PDF
    Tool support for automated fault localization in program debugging is limited because state-of-the-art algorithms often fail to provide efficient help to the user. They usually offer a ranked list of suspicious code elements, but the fault is not guaranteed to be found among the highest ranks. In Spectrum-Based Fault Localization (SBFL) – which uses code coverage information of test cases and their execution outcomes to calculate the ranks –, the developer has to investigate several locations before finding the faulty code element. Yet, all the knowledge she a priori has or acquires during this process is not reused by the SBFL tool. There are existing approaches in which the developer interacts with the SBFL algorithm by giving feedback on the elements of the prioritized list. We propose a new approach called iFL which extends interactive approaches by exploiting contextual knowledge of the user about the next item in the ranked list (e. g., a statement), with which larger code entities (e. g., a whole function) can be repositioned in their suspiciousness. We implemented a closely related algorithm proposed by Gong et al. , called Talk . First, we evaluated iFL using simulated users, and compared the results to SBFL and Talk . Next, we introduced two types of imperfections in the simulation: user’s knowledge and confidence levels. On SIR and Defects4J, results showed notable improvements in fault localization efficiency, even with strong user imperfections. We then empirically evaluated the effectiveness of the approach with real users in two sets of experiments: a quantitative evaluation of the successfulness of using iFL , and a qualitative evaluation of practical uses of the approach with experienced developers in think-aloud sessions

    Using contextual knowledge in interactive fault localization

    Get PDF
    Tool support for automated fault localization in program debugging is limited because state-of-the-art algorithms often fail to provide efficient help to the user. They usually offer a ranked list of suspicious code elements, but the fault is not guaranteed to be found among the highest ranks. In Spectrum-Based Fault Localization (SBFL) – which uses code coverage information of test cases and their execution outcomes to calculate the ranks –, the developer has to investigate several locations before finding the faulty code element. Yet, all the knowledge she a priori has or acquires during this process is not reused by the SBFL tool. There are existing approaches in which the developer interacts with the SBFL algorithm by giving feedback on the elements of the prioritized list. We propose a new approach called iFL which extends interactive approaches by exploiting contextual knowledge of the user about the next item in the ranked list (e. g., a statement), with which larger code entities (e. g., a whole function) can be repositioned in their suspiciousness. We implemented a closely related algorithm proposed by Gong et al. , called Talk . First, we evaluated iFL using simulated users, and compared the results to SBFL and Talk . Next, we introduced two types of imperfections in the simulation: user’s knowledge and confidence levels. On SIR and Defects4J, results showed notable improvements in fault localization efficiency, even with strong user imperfections. We then empirically evaluated the effectiveness of the approach with real users in two sets of experiments: a quantitative evaluation of the successfulness of using iFL , and a qualitative evaluation of practical uses of the approach with experienced developers in think-aloud sessions

    Provenance, Incremental Evaluation, and Debugging in Datalog

    Get PDF
    The Datalog programming language has recently found increasing traction in research and industry. Driven by its clean declarative semantics, along with its conciseness and ease of use, Datalog has been adopted for a wide range of important applications, such as program analysis, graph problems, and networking. To enable this adoption, modern Datalog engines have implemented advanced language features and high-performance evaluation of Datalog programs. Unfortunately, critical infrastructure and tooling to support Datalog users and developers are still missing. For example, there are only limited tools addressing the crucial debugging problem, where developers can spend up to 30% of their time finding and fixing bugs. This thesis addresses Datalog’s tooling gaps, with the ultimate goal of improving the productivity of Datalog programmers. The first contribution is centered around the critical problem of debugging: we develop a new debugging approach that explains the execution steps taken to produce a faulty output. Crucially, our debugging method can be applied for large-scale applications without substantially sacrificing performance. The second contribution addresses the problem of incremental evaluation, which is necessary when program inputs change slightly, and results need to be recomputed. Incremental evaluation allows this recomputation to happen more efficiently, without discarding the previous results and recomputing from scratch. Finally, the last contribution provides a new incremental debugging approach that identifies the root causes of faulty outputs that occur after an incremental evaluation. Incremental debugging focuses on the relationship between input and output and can provide debugging suggestions to amend the inputs so that faults no longer occur. These techniques, in combination, form a corpus of critical infrastructure and tooling developments for Datalog, allowing developers and users to use Datalog more productively

    Coarse Hierarchical Delta Debugging

    No full text
    corecore