
    Learning Tractable Probabilistic Models for Fault Localization

    In recent years, several probabilistic techniques have been applied to various debugging problems. However, most existing probabilistic debugging systems use relatively simple statistical models and fail to generalize across multiple programs. In this work, we propose Tractable Fault Localization Models (TFLMs), which can be learned from data and probabilistically infer the location of a bug. While most previous statistical debugging methods generalize over many executions of a single program, TFLMs are trained on a corpus of previously seen buggy programs and learn to identify recurring patterns of bugs. Widely used fault localization techniques such as TARANTULA evaluate the suspiciousness of each line in isolation; in contrast, a TFLM defines a joint probability distribution over buggy indicator variables for each line. Joint distributions with rich dependency structure are often computationally intractable; TFLMs avoid this by exploiting recent developments in tractable probabilistic models (specifically, Relational SPNs). Further, TFLMs can incorporate additional sources of information, including coverage-based features such as TARANTULA scores. We evaluate the fault localization performance of TFLMs that include TARANTULA scores as features in the probabilistic model. Our study shows that the learned TFLMs isolate bugs more effectively than previous statistical methods or than TARANTULA used directly. Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI 2015)
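
    To make the per-line scoring concrete, here is a minimal sketch of the standard TARANTULA suspiciousness formula, which scores each line from the fraction of failing and passing tests that cover it. The function name `tarantula_scores` and the input layout (one set of covered line numbers per test) are assumptions chosen for illustration; this is not the TFLM model from the abstract, only the baseline feature it mentions.

```python
def tarantula_scores(coverage, outcomes):
    """Compute a TARANTULA suspiciousness score for every covered line.

    coverage: list of sets; coverage[i] holds the line numbers test i executed.
    outcomes: list of bools; outcomes[i] is True if test i passed.
    (Both layouts are hypothetical, picked for this sketch.)
    """
    total_passed = sum(1 for ok in outcomes if ok)
    total_failed = len(outcomes) - total_passed
    lines = set().union(*coverage) if coverage else set()

    scores = {}
    for line in sorted(lines):
        passed = sum(1 for cov, ok in zip(coverage, outcomes) if ok and line in cov)
        failed = sum(1 for cov, ok in zip(coverage, outcomes) if not ok and line in cov)
        # Fractions of passing / failing tests that execute this line.
        pass_ratio = passed / total_passed if total_passed else 0.0
        fail_ratio = failed / total_failed if total_failed else 0.0
        denom = pass_ratio + fail_ratio
        # Each line is scored in isolation, with no dependence on other lines.
        scores[line] = fail_ratio / denom if denom else 0.0
    return scores


if __name__ == "__main__":
    cov = [{1, 2, 3}, {1, 2}, {1, 3}]
    res = [True, True, False]
    for line, score in tarantula_scores(cov, res).items():
        print(line, round(score, 3))
```

    Because each score depends only on that line's own coverage counts, dependencies between lines are ignored; this is the limitation the joint distribution in a TFLM is meant to address.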

    You Cannot Fix What You Cannot Find! An Investigation of Fault Localization Bias in Benchmarking Automated Program Repair Systems

    Properly benchmarking Automated Program Repair (APR) systems should contribute to the development and adoption of the research outputs by practitioners. To that end, the research community must ensure that it reaches significant milestones by reliably comparing state-of-the-art tools for a better understanding of their strengths and weaknesses. In this work, we identify and investigate a practical bias caused by the fault localization (FL) step in a repair pipeline. We highlight the different fault localization configurations used in the literature and their impact on APR systems when applied to the Defects4J benchmark. Then, we explore the performance variations that can be achieved by 'tweaking' the FL step. Ultimately, we expect to create new momentum for (1) full disclosure of APR experimental procedures with respect to FL, (2) realistic expectations of repairing bugs in Defects4J, and (3) reliable performance comparison among state-of-the-art APR systems, and against the baseline performance results of our thoroughly assessed kPAR repair tool. Our main findings include: (a) only a subset of Defects4J bugs can currently be localized by commonly used FL techniques; (b) the current practice of comparing state-of-the-art APR systems (i.e., counting the number of fixed bugs) is potentially misleading due to the bias of FL configurations; and (c) APR authors do not properly qualify their performance achievements with respect to the different tuning parameters implemented in APR systems. Comment: Accepted by ICST 201

    A novel traveling-wave-based method improved by unsupervised learning for fault location of power cables via sheath current monitoring

    To improve the maintenance of power cables, this paper proposes a novel traveling-wave-based fault location method improved by unsupervised learning. The improvement lies mainly in identifying the arrival time of the traveling wave. The proposed approach consists of four steps: (1) the traveling waves associated with the sheath currents of the cables are grouped in a matrix; (2) dimensionality reduction by t-SNE (t-distributed Stochastic Neighbor Embedding) reconstructs the matrix features in a low dimension; (3) DBSCAN (density-based spatial clustering of applications with noise) clusters the sample points by the closeness of the sample distribution; (4) the arrival time of the traveling wave is identified by searching for the maximum slope point of the non-noise cluster with the fewest samples. Simulations and calculations have been carried out for both HV (high voltage) and MV (medium voltage) cables. Results indicate that the arrival time of the traveling wave can be identified for both HV and MV cables, with or without noise, and that the method tolerates a few random time errors in the recorded data. A lab-based experiment was carried out to validate the proposed method and helped to prove the effectiveness of the clustering and the fault location.
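
    Step (4) above can be sketched in isolation as follows. This is a rough illustration under stated assumptions, not the paper's implementation: it assumes each time-ordered sample already carries a cluster label from step (3), that noise points are labeled -1 (the convention used by common DBSCAN implementations), and that "slope" means the signed difference between consecutive samples at a unit time step. The name `arrival_index` is hypothetical.

```python
from collections import Counter

NOISE = -1  # assumed noise label, following the usual DBSCAN convention


def arrival_index(signal, labels):
    """Return the index of the steepest rise inside the smallest non-noise cluster.

    signal: list of floats, sheath-current samples in time order (assumed layout).
    labels: list of ints, one cluster label per sample; NOISE marks noise points.
    Returns None if no non-noise cluster exists or no slope can be computed.
    """
    if len(signal) != len(labels):
        raise ValueError("signal and labels must have the same length")

    # Count samples per cluster, ignoring noise.
    counts = Counter(label for label in labels if label != NOISE)
    if not counts:
        return None

    # Non-noise cluster with the fewest samples; ties broken by lowest label.
    target = min(counts, key=lambda label: (counts[label], label))

    best_index, best_slope = None, None
    for i in range(1, len(signal)):
        if labels[i] != target:
            continue
        slope = signal[i] - signal[i - 1]  # backward difference, unit time step
        if best_slope is None or slope > best_slope:
            best_index, best_slope = i, slope
    return best_index


if __name__ == "__main__":
    samples = [0.0, 0.0, 0.1, 0.9, 1.0, 1.0, 1.0]
    clusters = [0, 0, 1, 1, 0, 0, NOISE]
    print(arrival_index(samples, clusters))
```

    Choosing the smallest non-noise cluster reflects the abstract's wording; whether the slope should be signed or absolute is not stated there, so the signed form used here is only one reasonable reading.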