    Predicting Software Defects with Causality Tests

    In this paper, we propose a defect prediction approach centered on more robust evidence of causality between source code metrics (as predictors) and the occurrence of defects. More specifically, we rely on the Granger Causality Test to evaluate whether past variations in source code metric values can be used to forecast changes in a time series of defects. Our approach triggers alarms when changes made to the source code of a target system have a high chance of producing defects. We evaluated our approach in several life stages of four Java-based systems. We reached an average precision of 50% in three out of the four systems we evaluated. Moreover, our approach achieved better precision than baselines that are not based on causality tests.
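The idea behind the test can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it compares a model that forecasts a series from its own past with one that also uses the past of a second series, and reports the resulting F statistic. The function names, the lag of 1, and the synthetic `metric` and `defects` series are all assumptions made for this example. The p-value lookup and the stationarity checks that a real analysis would need are omitted.

```python
import random


def ols_rss(rows, targets):
    """Fit least squares via the normal equations; return the residual sum of squares."""
    k = len(rows[0])
    # Build the augmented system [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, k):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[row][c] -= factor * aug[col][c]
    # Back substitution to recover the coefficients.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - tail) / aug[i][i]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y, lag=1):
    """F statistic for 'x Granger-causes y' using `lag` past values of each series."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus y's own past values.
    own = [[1.0] + [y[t - i] for i in range(1, lag + 1)] for t in times]
    # Full model: the restricted regressors plus x's past values.
    both = [row + [x[t - i] for i in range(1, lag + 1)] for row, t in zip(own, times)]
    rss_restricted = ols_rss(own, targets)
    rss_full = ols_rss(both, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)


# Synthetic data (hypothetical, not taken from the paper):
# defects at time t depend on the metric at time t - 1.
rng = random.Random(0)
metric = [rng.gauss(0, 1) for _ in range(200)]
defects = [0.0] + [0.8 * metric[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]

f_metric_to_defects = granger_f(metric, defects)
f_defects_to_metric = granger_f(defects, metric)
print(f_metric_to_defects, f_defects_to_metric)
```

A large F statistic in one direction and a small one in the other suggests that the past of the first series helps forecast the second, which is all that Granger causality claims.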

    BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests

    Background: Despite the increasing number of bug analysis tools for exploring bugs in software systems, there are no tools supporting the investigation of causality relationships between internal quality metrics and bugs. In this paper, we propose an extension of the BugMaps tool called BugMaps-Granger that allows the analysis of source code properties that are more likely to cause bugs. For this purpose, we relied on the Granger Causality Test to evaluate whether past changes to a given time series of source code metrics can be used to forecast changes in a time series of defects. Our tool extracts source code versions from version control platforms, calculates source code metrics and defects time series, computes Granger Test results, and provides interactive visualizations for causal analysis of bugs. Results: We provide an example of use of BugMaps-Granger involving data from the Equinox Framework and Eclipse JDT Core systems collected during three years. For these systems, the tool was able to identify the modules with more bugs, the average lifetime and complexity of the bugs, and the source code properties that are more likely to cause bugs. Conclusions: With the results provided by the tool in hand, a maintainer can perform at least two main software quality assurance activities: (a) refactoring the source code properties that Granger-caused bugs and (b) improving unit test coverage in classes with more bugs.

    Uncovering Causal Relationships between Software Metrics and Bugs

    Bug prediction is an important challenge for software engineering research. It consists of looking for possible early indicators of the presence of bugs in software. Despite the relevance of the issue, most experiments designed to evaluate bug prediction only investigate whether there is a linear relation between the predictor and the presence of bugs. However, it is well known that standard regression models cannot filter out spurious relations. Therefore, in this paper we describe an experiment to discover more robust evidence of causality between software metrics (as predictors) and the occurrence of bugs. For this purpose, we have relied on the Granger Causality Test to evaluate whether past changes in a given time series are useful to forecast changes in another series. As its name suggests, the Granger Test is a better indication of causality between two variables. We present and discuss the results of experiments on four real-world systems evaluated over a time frame of almost four years. In particular, we have been able to discover in the history of metrics the causes, in the terms of the Granger Test, for 64% to 93% of the defects reported for the systems considered in our experiment.

    Towards Automated Performance Bug Identification in Python

    Context: Software performance is a critical non-functional requirement, appearing in many fields such as mission-critical applications, financial, and real-time systems. In this work we focused on early detection of performance bugs; our software under study was a real-time system used in the advertisement/marketing domain. Goal: Find a simple and easy-to-implement solution for predicting performance bugs. Method: We built several models using four machine learning methods commonly used for defect prediction: C4.5 Decision Trees, Naïve Bayes, Bayesian Networks, and Logistic Regression. Results: Our empirical results show that a C4.5 model, using lines of code changed, file's age and size as explanatory variables, can be used to predict performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that reducing the number of changes delivered on a commit can decrease the chance of performance bug injection. Conclusions: We believe that our approach can help practitioners to eliminate performance bugs early in the development cycle. Our results are also of interest to theoreticians, establishing a link between functional bugs and (non-functional) performance bugs, and explicitly showing that attributes used for prediction of functional bugs can be used for prediction of performance bugs.
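For readers less familiar with the three scores quoted above, the following is a small sketch of how recall, accuracy, and precision are derived from a confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predicted bugs that are real bugs
    recall = tp / (tp + fn)     # share of real bugs that were predicted
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all correct predictions
    return precision, recall, accuracy


# Hypothetical counts, not taken from the paper.
precision, recall, accuracy = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(precision, recall, accuracy)
```

A model with high precision and lower recall, as reported in the abstract, raises few false alarms but misses some of the real performance bugs.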

    A Landscape Perspective on Bird Beak Deformity: An Epizootic of Unknown Etiology

    Although birds with beak deformities have been documented throughout the literature, the recent spike in occurrences in certain regions has caused concern in the scientific community. A major concern relates to the role of contaminants and environmental degradation in causing or exacerbating this epizootic. This study used spatial and statistical analyses to examine the problem from a landscape perspective. The objectives of this study were to 1) locate and compile a database of known bird beak deformity occurrences, 2) conduct a preliminary assessment of the environmental correlates of this epizootic in order to identify patterns, and 3) make recommendations that could guide future research and data collection. Logistic regression models were generated using known occurrences of bird beak deformity as well as randomly generated points compared with spatial data on relevant environmental variables. Generalized linear models predicted a high probability (p(deformity)=0.88) of deformity occurring when all environmental variables were present. With more collaboration among researchers and data sharing, this method could provide insight into the currently unknown etiology of bird beak deformity.

    The Co-Evolution of Test Maintenance and Code Maintenance through the lens of Fine-Grained Semantic Changes

    Automatic testing is a widely adopted technique for improving software quality. Software developers add, remove and update test methods and test classes as part of the software development process as well as during the evolution phase, following the initial release. In this work we conduct a large-scale study of 61 popular open source projects and report the relationships we have established between test maintenance, production code maintenance, and semantic changes (e.g., statement added, method removed, etc.) performed in developers' commits. We build predictive models, and show that the number of tests in a software project can be well predicted by employing code maintenance profiles (i.e., how many commits were performed in each of the maintenance activities: corrective, perfective, adaptive). Our findings also reveal that more often than not, developers perform code fixes without performing complementary test maintenance in the same commit (e.g., update an existing test or add a new one). When developers do perform test maintenance, it is likely to be affected by the semantic changes they perform as part of their commit. Our work is based on studying 61 popular open source projects, comprising over 240,000 commits consisting of over 16,000,000 semantic change type instances, performed by over 4,000 software engineers. Comment: postprint, ICSME 201

    A Bayesian Network Approach to Estimating Software Reliability of RSG-GAS Reactor Protection System

    Reliability represents one of the most important attributes of software quality. Assessing the reliability of software embedded in the safety of highly critical systems is essential. Unfortunately, there are many factors influencing software reliability that cannot be measured directly. Furthermore, the existing models and approaches for assessing software reliability have assumptions and limitations which are not directly acceptable for all systems, such as reactor protection systems. This paper presents the result of a study which aims to conduct a quantitative assessment of the software reliability of the reactor protection system (RPS) of RSG-GAS based on the software development life cycle. A Bayesian network (BN) is applied in this research and used to predict software defects in operation, which represent the software reliability. The availability of operation failure data, characteristics of the RPS components and their operation features, prior knowledge on the software development and system reliability, as well as relevant findings from references were considered in the assessment and the construction of nodes on the causal network model. The structure of the causal model consists of eight nodes including design quality, problem complexity, and defects inserted in the software. The calculation result using Agenarisk software revealed that software defects in the operation of the RPS follow a binomial statistic distribution with a mean of 1.393. This number indicated a high software maturity level and high capability of the organization. The improvement of the software defect concentration range on the posterior distribution compared with the prior’s is also identified. The result achieved is valuable for further reliability estimation by introducing new evidence and experience data, and by setting up an appropriate plan in order to enhance software reliability in the RPS.

    Applications of Causality and Causal Inference in Software Engineering

    Causal inference is a study of causal relationships between events and the statistical study of inferring these relationships through interventions and other statistical techniques. Causal reasoning is any line of work toward determining causal relationships, including causal inference. This paper explores the relationship between causal reasoning and various fields of software engineering. This paper aims to uncover which software engineering fields are currently benefiting from the study of causal inference and causal reasoning, as well as which aspects of various problems are best addressed using this methodology. With this information, this paper also aims to find future subjects and fields that would benefit from this form of reasoning and to provide that information to future researchers. This paper follows a systematic literature review, including the formulation of a search query, inclusion and exclusion criteria for the search results, clarifying questions answered by the found literature, and synthesizing the results from the literature review. Through close examination of the 45 papers found to be relevant to the research questions, it was revealed that the majority of causal reasoning as related to software engineering is related to testing through root cause localization. Furthermore, most causal reasoning is done informally through an exploratory process of forming a Causality Graph as opposed to strict statistical analysis or introduction of interventions. Finally, causal reasoning is also used as a justification for many tools intended to make the software more human-readable by providing additional causal information to logging processes or modeling languages.