
    An Empirical Study of Fault Localization in Python Programs

    Despite its massive popularity as a programming language, especially in novel domains like data science, there is comparatively little fault-localization research that targets Python. Even though it is plausible that several findings about languages like C/C++ and Java -- the most common choices for fault-localization research -- carry over to other languages, whether Python's dynamic nature and how the language is used in practice affect the capabilities of classic fault-localization approaches remain open questions. This paper is the first large-scale empirical study of fault localization on real-world Python programs and faults. Using Zou et al.'s recent large-scale empirical study of fault localization in Java as the basis of our study, we investigated the effectiveness (i.e., localization accuracy), efficiency (i.e., runtime performance), and other features (e.g., different entity granularities) of seven well-known fault-localization techniques in four families (spectrum-based, mutation-based, predicate switching, and stack-trace-based) on 135 faults from 13 open-source Python projects in the BugsInPy curated collection. The results replicate for Python several results known about Java, and shed light on whether Python's peculiarities affect the capabilities of fault localization. The replication package that accompanies this paper includes detailed data about our experiments, as well as the tool FauxPy that we implemented to conduct the study.
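
    As a concrete illustration of the spectrum-based family studied here, the following minimal sketch ranks program entities with the classic Ochiai formula. The coverage data, entity names, and test identifiers are hypothetical, and this is a sketch of the general technique, not FauxPy's implementation.

```python
import math

def ochiai(failed_cov: int, passed_cov: int, total_failed: int) -> float:
    """Ochiai suspiciousness: failed_cov / sqrt(total_failed * (failed_cov + passed_cov))."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0

# coverage[entity] = set of test ids that executed the entity (illustrative data)
coverage = {
    "mod.parse:12": {"t1", "t2", "t3"},
    "mod.eval:40":  {"t2", "t3"},
    "mod.fmt:77":   {"t1"},
}
failing = {"t2", "t3"}  # hypothetical failing tests

total_failed = len(failing)
ranking = sorted(
    coverage,
    key=lambda e: ochiai(
        len(coverage[e] & failing),   # failing tests covering the entity
        len(coverage[e] - failing),   # passing tests covering the entity
        total_failed,
    ),
    reverse=True,
)
print(ranking)  # entities ordered from most to least suspicious
```

    Here "mod.eval:40", executed by both failing tests and no passing test, ranks first; entities covered only by passing tests score zero.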

    Rigorous statistical detection and characterization of a deviation from the Gutenberg-Richter distribution above magnitude 8 in subduction zones

    We present a quantitative statistical test for the presence of a crossover c0 in the Gutenberg-Richter distribution of earthquake seismic moments, separating the usual power-law regime for seismic moments less than c0 from another, faster-decaying regime beyond c0. Our method is based on the transformation of the ordered sample of seismic moments into a series with uniform distribution under the condition of no crossover. The bootstrap method allows us to estimate the statistical significance of the null hypothesis H0 of an absence of crossover (c0 = infinity). When H0 is rejected, we estimate the crossover c0 using the bootstrap method and two different competing models for the second regime beyond c0. For the catalog obtained by aggregating 14 subduction zones of the Circum-Pacific Seismic Belt, our estimate of the crossover point is log(c0) = 28.14 ± 0.40 (c0 in dyne-cm), corresponding to a crossover magnitude mW = 8.1 ± 0.3. For separate subduction zones, the corresponding estimates are much more uncertain, so the null hypothesis of an identical crossover for all subduction zones cannot be rejected. Such a large value of the crossover magnitude makes it difficult to associate it directly with a seismogenic thickness, as proposed by many authors in the past. Our measure of c0 may substantiate the concept that the localization of strong shear deformation could propagate significantly into the lower crust and upper mantle, thus increasing the effective size beyond which one should expect a change of regime.
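
    The shape of such a test can be pictured with a short sketch: under H0 (a pure power law with no crossover), the probability integral transform of the tail sample is uniform, and a parametric bootstrap calibrates how large a departure from uniformity is surprising. Everything below (the Hill estimator, the Kolmogorov-Smirnov distance, the synthetic catalog) is an illustrative assumption, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def ks_uniform(u):
    """One-sample KS distance of values u from Uniform(0, 1)."""
    u = np.sort(u)
    n = len(u)
    k = np.arange(1, n + 1)
    return np.max(np.maximum(k / n - u, u - (k - 1) / n))

def pareto_test_stat(x, xmin):
    """Fit a pure power law above xmin (Hill MLE), transform the sample
    through the fitted CDF, and return the KS distance from uniformity."""
    x = x[x >= xmin]
    beta = len(x) / np.sum(np.log(x / xmin))  # Hill estimator of the exponent
    u = 1.0 - (xmin / x) ** beta              # probability integral transform
    return ks_uniform(u), beta, len(x)

def bootstrap_pvalue(x, xmin, n_boot=2000):
    """Parametric bootstrap: how often does a sample that truly follows the
    fitted pure power law (H0: no crossover) look at least as non-uniform?"""
    d_obs, beta, n = pareto_test_stat(x, xmin)
    count = 0
    for _ in range(n_boot):
        xb = xmin * rng.uniform(size=n) ** (-1.0 / beta)  # Pareto(beta) sample
        d_b, _, _ = pareto_test_stat(xb, xmin)
        count += d_b >= d_obs
    return d_obs, count / n_boot

# synthetic catalog of "seismic moments": a pure Pareto tail, so H0 should hold
moments = 1e25 * rng.uniform(size=500) ** (-1.0 / 0.66)
d, p = bootstrap_pvalue(moments, xmin=1e25)
print(f"KS distance {d:.3f}, bootstrap p-value {p:.3f}")
```

    Refitting the exponent inside each bootstrap replicate is what accounts for the parameter being estimated from the same data the test is run on.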

    Automated metamorphic testing on the analyses of feature models

    Context: A feature model (FM) represents the valid combinations of features in a domain. The automated extraction of information from FMs is a complex task that involves numerous analysis operations, techniques, and tools. Current testing methods in this context are manual and rely on the ability of the tester to decide whether the output of an analysis is correct. However, this is acknowledged to be time-consuming, error-prone, and in most cases infeasible due to the combinatorial complexity of the analyses; this is known as the oracle problem. Objective: In this paper, we propose using metamorphic testing to automate the generation of test data for feature-model analysis tools, overcoming the oracle problem. An automated test data generator is presented and evaluated to show the feasibility of our approach. Method: We present a set of relations (so-called metamorphic relations) between input FMs and the sets of products they represent. Based on these relations, and given an FM and its known set of products, a set of neighbouring FMs together with their corresponding sets of products are automatically generated and used for testing multiple analyses. Complex FMs representing millions of products can be efficiently created by applying this process iteratively. Results: Our evaluation results using mutation testing and real faults reveal that most faults can be automatically detected within a few seconds. Two defects were found in FaMa and another two in SPLOT, two real tools for the automated analysis of feature models. We also show how our generator outperforms a related manual suite for the automated analysis of feature models, and how this suite can be used to guide the automated generation of test cases, obtaining important gains in efficiency. Conclusion: Our results show that the application of metamorphic testing in the domain of automated analysis of feature models is efficient and effective, detecting most faults in a few seconds without the need for a human oracle. This work has been partially supported by the European Commission (FEDER) and the Spanish Government under CICYT project SETI (TIN2009-07366) and the Andalusian Government project ISABEL (TIC-2533).
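
    One way to picture a metamorphic relation in this setting: attaching an optional child feature to an FM transforms its known product set in a fully predictable way, which yields an oracle for the follow-up FM for free. The set-of-feature-sets encoding and the specific relation below are illustrative assumptions, not the paper's catalogue of relations.

```python
def add_optional_child(products, parent, feature):
    """Follow-up product set after attaching an optional child `feature`
    under `parent`: every product keeps its old form, and each product
    containing `parent` additionally appears extended with `feature`."""
    out = set(products)
    out |= {p | {feature} for p in products if parent in p}
    return out

# source test case: a toy FM with root A and optional child B (illustrative)
source_products = {frozenset({"A"}), frozenset({"A", "B"})}

# the oracle for the follow-up FM comes from the relation, not from a human
expected = add_optional_child(source_products, parent="A", feature="C")

# a real run would ask the FM analysis tool under test for the products of
# the follow-up FM; here a stand-in value keeps the sketch self-contained
actual = expected
assert actual == expected, "metamorphic relation violated: tool is faulty"
```

    Applying such relations iteratively grows large FMs whose product sets stay known by construction, which is what makes testing without a human oracle possible.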

    Spectrum-based feature localization: A case study using ArgoUML

    Feature localization (FL) is a basic activity in re-engineering legacy systems into software product lines. In this work, we explore the use of the spectrum-based localization technique for this task. This technique is traditionally used for fault localization, but it has practical applications in other tasks, such as the dynamic FL approach that we propose. The ArgoUML SPL benchmark is used as a case study, and we compare our approach with a previous hybrid (static and dynamic) approach, from which we reuse the manual and testing execution traces of the features. We conclude that it is feasible and sound to use the spectrum-based approach, which provides promising results on the benchmark metrics.
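
    The adaptation from fault to feature localization can be sketched as follows: execution traces that exercise the feature of interest play the role that failing tests play in fault localization. The Tarantula-style scoring, the scenario names, and the ArgoUML-like code units below are hypothetical; this is only a sketch of the idea, not the study's tooling.

```python
def feature_score(exec_feat, exec_other, n_feat, n_other):
    """Tarantula-style ratio adapted to feature localization: how specific
    is a code unit to the traces that exercise the target feature?"""
    f = exec_feat / n_feat if n_feat else 0.0
    o = exec_other / n_other if n_other else 0.0
    return f / (f + o) if (f + o) else 0.0

# traces[scenario] = set of code units executed (illustrative data)
traces = {
    "edit_class_diagram":    {"DiagramImpl", "ClassNode", "Renderer"},
    "edit_state_diagram":    {"DiagramImpl", "StateNode", "Renderer"},
    "use_cognitive_critics": {"Critic", "Agency", "Renderer"},
}
feature_scenarios = {"use_cognitive_critics"}  # traces exercising the feature
other = set(traces) - feature_scenarios

units = set().union(*traces.values())
scores = {
    u: feature_score(
        sum(u in traces[s] for s in feature_scenarios),
        sum(u in traces[s] for s in other),
        len(feature_scenarios),
        len(other),
    )
    for u in units
}
for u, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{s:.2f}  {u}")  # code units most specific to the feature first
```

    Units executed only under the feature scenarios score 1.0, shared infrastructure like "Renderer" scores lower, and units never touched by the feature score 0.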

    Locating Faults with Program Slicing: An Empirical Analysis

    Statistical fault localization is an easily deployed technique for quickly determining candidates for faulty code locations. If a human programmer has to search for the fault beyond the top candidate locations, though, more traditional techniques of following dependencies along dynamic slices may be better suited. In a large study of 457 bugs (369 single faults and 88 multiple faults) in 46 open-source C programs, we compare the effectiveness of statistical fault localization against dynamic slicing. For single faults, we find that dynamic slicing was eight percentage points more effective than the best-performing statistical debugging formula; for 66% of the bugs, dynamic slicing finds the fault earlier than the best-performing statistical debugging formula. In our evaluation, dynamic slicing is more effective for programs with a single fault, but statistical debugging performs better on multiple faults. The best results, however, are obtained by a hybrid approach: if programmers first examine at most the top five most suspicious locations from statistical debugging, and then switch to dynamic slices, on average they will need to examine 15% (30 lines) of the code. These findings hold for the 18 most effective statistical debugging formulas, and our results are independent of the number of faults (i.e., single or multiple) and error type (i.e., artificial or real).
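
    The hybrid strategy in the conclusion can be made concrete with a small sketch: inspect the top-k statistically suspicious locations first, then walk the dynamic backward slice from the failure. The dependence map, location labels, and k = 5 cut-off below are illustrative assumptions about how such a policy could be expressed, not the study's implementation.

```python
from collections import deque

def hybrid_inspection_order(ranking, failure_loc, dyn_deps, k=5):
    """Inspection order under the hybrid policy: the top-k locations of the
    statistical ranking first, then a BFS over the dynamic backward slice
    (dependence edges) from the failing location, skipping repeats."""
    order, seen = [], set()
    for loc in ranking[:k]:          # phase 1: statistical debugging
        if loc not in seen:
            order.append(loc)
            seen.add(loc)
    visited = set()                  # phase 2: dynamic slice traversal
    queue = deque([failure_loc])
    while queue:
        loc = queue.popleft()
        if loc in visited:
            continue
        visited.add(loc)
        if loc not in seen:
            order.append(loc)
            seen.add(loc)
        queue.extend(dyn_deps.get(loc, ()))
    return order

# illustrative data: a suspiciousness ranking and dynamic dependence edges
ranking = ["f.c:88", "g.c:14", "f.c:90", "h.c:3", "g.c:20", "f.c:12"]
dyn_deps = {"f.c:90": ["f.c:88", "f.c:12"], "f.c:12": ["h.c:3"]}
print(hybrid_inspection_order(ranking, failure_loc="f.c:90", dyn_deps=dyn_deps))
```

    Tracking slice traversal separately from inspected locations lets the walk pass through already-examined code to reach dependencies that the statistical ranking missed.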