
    Spectrum-based fault localization in software product lines

    Context: Software Product Line (SPL) testing is challenging, mainly due to the potentially huge number of products under test. Most research in this field focuses on making testing affordable by selecting a representative subset of products to be tested. However, once the tests are executed and some failures are revealed, debugging is a cumbersome and time-consuming task due to the difficulty of localizing and isolating the faulty features in the SPL. Objective: This paper presents a debugging approach for the localization of bugs in SPLs. Method: The proposed approach works in two steps. First, the features of the SPL are ranked according to their suspiciousness (i.e., likelihood of being faulty) using spectrum-based localization techniques. Then, a novel fault isolation approach is used to generate valid products of minimum size containing the most suspicious features, helping to isolate the cause of failures. Results: For the evaluation of our approach, we compared ten suspiciousness techniques on nine SPLs of different sizes. The results reveal that three of the techniques (Tarantula, Kulczynski2 and Ample2) stand out over the rest, showing stable performance with different types of faults and product suite sizes. Using these metrics, faults were localized by examining between 0.1% and 14.4% of the feature sets. Conclusion: Our results show that the proposed approach is effective at locating bugs in SPLs, serving as a helpful complement to the numerous approaches for testing SPLs. Funding: Ministerio de Economía y Competitividad TIN2015-70560-R (BELI); Junta de Andalucía P12-TIC-1867 (COPAS).
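    For concreteness, here is a minimal sketch of the kind of spectrum-based ranking the paper builds on: the Tarantula metric scores each entity by how strongly its presence correlates with failing runs. The feature names and coverage counts below are invented for illustration; the paper ranks SPL features rather than statements.

```python
# Minimal sketch of Tarantula-style suspiciousness ranking.
# Feature names and counts are hypothetical, for illustration only.

def tarantula(passed_cov: int, failed_cov: int,
              total_passed: int, total_failed: int) -> float:
    """Tarantula score: higher means more suspicious."""
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0

# feature -> (covered in passing runs, covered in failing runs)
spectra = {
    "Encryption":  (1, 4),
    "Logging":     (5, 4),
    "Compression": (6, 0),
}
total_passed, total_failed = 6, 4

ranking = sorted(spectra,
                 key=lambda f: tarantula(*spectra[f], total_passed, total_failed),
                 reverse=True)
print(ranking)  # features ordered from most to least suspicious
```

    In this toy spectrum, "Encryption" appears in all failing runs but almost no passing ones, so it ranks first; the paper's isolation step would then generate a minimal valid product containing it.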

    Learning Tractable Probabilistic Models for Fault Localization

    In recent years, several probabilistic techniques have been applied to various debugging problems. However, most existing probabilistic debugging systems use relatively simple statistical models and fail to generalize across multiple programs. In this work, we propose Tractable Fault Localization Models (TFLMs), which can be learned from data and probabilistically infer the location of a bug. While most previous statistical debugging methods generalize over many executions of a single program, TFLMs are trained on a corpus of previously seen buggy programs and learn to identify recurring patterns of bugs. Widely used fault localization techniques such as TARANTULA evaluate the suspiciousness of each line in isolation; in contrast, a TFLM defines a joint probability distribution over the buggy indicator variables of all lines. Joint distributions with rich dependency structure are often computationally intractable; TFLMs avoid this by exploiting recent developments in tractable probabilistic models (specifically, Relational SPNs). Further, TFLMs can incorporate additional sources of information, including coverage-based features such as TARANTULA scores. We evaluate the fault localization performance of TFLMs that include TARANTULA scores as features in the probabilistic model. Our study shows that the learned TFLMs isolate bugs more effectively than previous statistical methods or TARANTULA alone. Comment: Fifth International Workshop on Statistical Relational AI (StaR-AI 2015).
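    To make the contrast concrete, the toy model below scores per-line bug indicators jointly instead of in isolation. It is emphatically not the paper's Relational SPN: it is a hand-rolled pairwise model with invented scores and an assumed coupling term, included only to illustrate how a joint distribution over indicator variables differs from scoring each line independently.

```python
# Toy joint model over per-line bug indicators b_i (NOT the paper's RSPN).
# Each line's marginal follows an invented Tarantula-like feature, and
# adjacent buggy lines reinforce each other via an assumed coupling factor.
import math
from itertools import product

tarantula_scores = [0.2, 0.9, 0.85]  # invented per-line coverage features

def joint_logprob(assignment, scores, coupling=1.5):
    """Unnormalized log-probability of one 0/1 assignment to all lines."""
    lp = 0.0
    for b, s in zip(assignment, scores):
        p = 0.1 + 0.8 * s                 # feature-driven prior, kept away from 0/1
        lp += math.log(p if b else 1.0 - p)
    for b1, b2 in zip(assignment, assignment[1:]):
        if b1 and b2:
            lp += math.log(coupling)      # assumption: bugs cluster in adjacent lines
    return lp

# Marginal P(b_i = 1) by brute-force enumeration (fine for 3 lines).
assignments = list(product([0, 1], repeat=len(tarantula_scores)))
z = sum(math.exp(joint_logprob(a, tarantula_scores)) for a in assignments)
for i in range(len(tarantula_scores)):
    marginal = sum(math.exp(joint_logprob(a, tarantula_scores))
                   for a in assignments if a[i]) / z
    print(f"line {i}: P(buggy) = {marginal:.2f}")
```

    Brute-force enumeration is exponential in the number of lines, which is exactly the intractability the paper sidesteps by restricting the joint distribution to a tractable model class.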

    A Comprehensive Empirical Investigation on Failure Clustering in Parallel Debugging

    Clustering has attracted a lot of attention as a promising strategy for parallel debugging in multi-fault scenarios. This heuristic approach (also called failure indexing or fault isolation) enables developers to perform multiple debugging tasks simultaneously by dividing failed test cases into several disjoint groups. When statement ranking representation is used to model failures for better clustering, several factors influence clustering effectiveness, including the risk evaluation formula (REF), the number of faults (NOF), the fault type (FT), and the number of successful test cases paired with one individual failed test case (NSP1F). In this paper, we present the first comprehensive empirical study of how these four factors influence clustering effectiveness. We conduct extensive controlled experiments on 1060 faulty versions of 228 simulated faults and 141 real faults, and the results reveal that: 1) GP19 is highly competitive across all REFs; 2) clustering effectiveness decreases as NOF increases; 3) higher clustering effectiveness is easier to achieve when a program contains only predicate faults; and 4) clustering effectiveness is preserved even when the scale of NSP1F is reduced to 20%.
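    The sketch below illustrates the statement-ranking representation under stated assumptions: each failed test, paired with the successful tests, induces a per-statement suspiciousness vector (here via one published form of the GP19 formula from the SBFL literature), and failures with similar vectors are grouped into the same debugging task. The coverage matrices, cluster count, and choice of k-means are illustrative, not the paper's exact pipeline.

```python
# Hypothetical failure-indexing sketch: model each failed test by the
# suspiciousness ranking it induces, then cluster the ranking vectors.
import math
from sklearn.cluster import KMeans

def gp19(ef, ep, nf, np_):
    # One published form of GP19: ef * sqrt(|ep - ef + nf - np|)
    return ef * math.sqrt(abs(ep - ef + nf - np_))

def ranking_vector(failed_cov, passing_covs):
    """Suspiciousness of each statement w.r.t. one failed test paired
    with the successful tests (NSP1F = len(passing_covs))."""
    vec = []
    for s in range(len(failed_cov)):
        ef = failed_cov[s]                    # this single failed test
        nf = 1 - ef
        ep = sum(cov[s] for cov in passing_covs)
        np_ = len(passing_covs) - ep
        vec.append(gp19(ef, ep, nf, np_))
    return vec

# Invented coverage matrices: rows are tests, columns are statements.
passing = [[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]]
failed  = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]

vectors = [ranking_vector(f, passing) for f in failed]
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)  # failed tests sharing a label go to the same debugging task
```

    Shrinking NSP1F corresponds to passing fewer successful tests into ranking_vector; the paper's fourth finding says the resulting vectors still separate well at 20% of the original scale.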

    Using hardware performance counters for fault localization

    In this work, we leverage data collected from hardware performance counters as an abstraction mechanism for program executions and use these abstractions to identify likely causes of failures. Our approach can be summarized as follows: hardware counter data is collected from both successful and failed executions, the data collected from the successful executions is used to create normal-behavior models of the programs, and deviations from these models observed in failed executions are scored and reported as likely causes of failures. The results of our experiments, conducted on three open source projects, suggest that the proposed approach can effectively prioritize the space of likely causes of failures, which can in turn improve the turnaround time for defect fixes.
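    A minimal sketch of the deviation-scoring idea, under assumed data: per-counter statistics from passing runs serve as the normal-behavior model, and a failing run's counters are ranked by their z-score deviation. The counter names and values are invented; in practice the data would come from a profiler such as perf or PAPI, and the paper's actual models may be richer.

```python
# Hypothetical deviation scoring over hardware performance counters.
# Passing-run statistics form the "normal behavior" model; the failing
# run's counters are ranked by how many standard deviations they drift.
import statistics

passing_runs = [  # invented counter readings from successful executions
    {"L1_DCACHE_MISSES": 1.0e5, "BRANCH_MISPREDICTS": 2.0e4, "INSTRUCTIONS": 9.0e8},
    {"L1_DCACHE_MISSES": 1.1e5, "BRANCH_MISPREDICTS": 2.2e4, "INSTRUCTIONS": 9.2e8},
    {"L1_DCACHE_MISSES": 0.9e5, "BRANCH_MISPREDICTS": 1.9e4, "INSTRUCTIONS": 8.9e8},
]
failing_run = {"L1_DCACHE_MISSES": 3.4e5, "BRANCH_MISPREDICTS": 2.1e4, "INSTRUCTIONS": 9.1e8}

def deviation_scores(model_runs, observed):
    """Z-score of each counter in the failing run against the passing model."""
    scores = {}
    for counter in observed:
        values = [run[counter] for run in model_runs]
        mean, std = statistics.mean(values), statistics.stdev(values)
        scores[counter] = abs(observed[counter] - mean) / std if std else 0.0
    return scores

# Counters with the largest deviations are reported first as likely
# symptoms of the failure cause.
for counter, score in sorted(deviation_scores(passing_runs, failing_run).items(),
                             key=lambda kv: kv[1], reverse=True):
    print(f"{counter}: {score:.1f} sigma")
```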