Predicting Software Defects with Causality Tests
In this paper, we propose a defect prediction approach centered on more robust evidence of causality between source code metrics (as predictors) and the occurrence of defects. More specifically, we rely on the Granger Causality Test to evaluate whether past variations in source code metric values can be used to forecast changes in a time series of defects. Our approach triggers alarms when changes made to the source code of a target system have a high chance of producing defects. We evaluated our approach in several life stages of four Java-based systems. We reached an average precision of 50% in three out of the four systems we evaluated. Moreover, our approach achieved better precision than baselines that are not based on causality tests.
BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests
Background: Despite the increasing number of bug analysis tools for exploring bugs in software systems, there are no tools supporting the investigation of causality relationships between internal quality metrics and bugs. In this paper, we propose an extension of the BugMaps tool called BugMaps-Granger that allows the analysis of source code properties that are more likely to cause bugs. For this purpose, we relied on the Granger Causality Test to evaluate whether past changes to a given time series of source code metrics can be used to forecast changes in a time series of defects. Our tool extracts source code versions from version control platforms, calculates source code metrics and defects time series, computes Granger Test results, and provides interactive visualizations for causal analysis of bugs.
Results: We provide an example of use of BugMaps-Granger involving data from the Equinox Framework and Eclipse JDT Core systems collected over three years. For these systems, the tool was able to identify the modules with more bugs, the average lifetime and complexity of the bugs, and the source code properties that are more likely to cause bugs.
Conclusions: With the results provided by the tool in hand, a maintainer can perform at least two main software quality assurance activities: (a) refactoring the source code properties that Granger-caused bugs and (b) improving unit test coverage in classes with more bugs.
Uncovering Causal Relationships between Software Metrics and Bugs
Bug prediction is an important challenge for software engineering research. It consists of looking for possible early indicators of the presence of bugs in a software system. Despite the relevance of the issue, most experiments designed to evaluate bug prediction only investigate whether there is a linear relation between the predictor and the presence of bugs. However, it is well known that standard regression models cannot filter out spurious relations. Therefore, in this paper we describe an experiment to discover more robust evidence of causality between software metrics (as predictors) and the occurrence of bugs. For this purpose, we have relied on the Granger Causality Test to evaluate whether past changes in a given time series are useful to forecast changes in another series. As its name suggests, the Granger Test gives a better indication of causality between two variables. We present and discuss the results of experiments on four real-world systems evaluated over a time frame of almost four years. In particular, we have been able to discover in the history of metrics the causes (in terms of the Granger Test) for 64% to 93% of the defects reported for the systems considered in our experiment.
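The three entries above all rely on the Granger Causality Test. As a rough illustration only (not the authors' implementation), the test can be sketched as a comparison of two least-squares fits: one predicting defects from their own past, and one that also uses past metric values. The function names, the synthetic series, and the lag of 1 are assumptions made for this sketch; it computes only the F statistic, and a p-value would additionally need the F distribution from a statistics library.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    # Solve (X'X) beta = X'y by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, lag=1):
    """F statistic for the hypothesis 'x Granger-causes y' at the given lag."""
    times = range(lag, len(y))
    targets = y[lag:]
    # Restricted model: y[t] explained by a constant and its own past values.
    restricted = [[1.0] + [y[t - i] for i in range(1, lag + 1)] for t in times]
    # Unrestricted model: additionally uses past values of x.
    unrestricted = [row + [x[t - i] for i in range(1, lag + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)

# Synthetic, hypothetical data: defects follow the previous period's metric value.
random.seed(0)
metric_series = [random.gauss(0, 1) for _ in range(200)]
defect_series = [0.0] + [0.8 * metric_series[t - 1] + random.gauss(0, 0.1)
                         for t in range(1, 200)]

f_forward = granger_f(metric_series, defect_series)   # metric -> defects
f_reverse = granger_f(defect_series, metric_series)   # defects -> metric
print(f_forward, f_reverse)
```

On this synthetic data the forward statistic should be far larger than the reverse one, which is the asymmetry the test is designed to expose. Real metric and defect series would first need checks such as stationarity, which this sketch omits.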
Towards Automated Performance Bug Identification in Python
Context: Software performance is a critical non-functional requirement,
appearing in many fields such as mission-critical applications, financial
systems, and real-time systems. In this work we focused on early detection of
performance bugs; our software under study was a real-time system used in the
advertisement/marketing domain.
Goal: Find a simple, easy-to-implement solution for predicting performance
bugs.
Method: We built several models using four machine learning methods, commonly
used for defect prediction: C4.5 Decision Trees, Naïve Bayes, Bayesian
Networks, and Logistic Regression.
Results: Our empirical results show that a C4.5 model, using lines of code
changed, file's age and size as explanatory variables, can be used to predict
performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that
reducing the number of changes delivered in a commit can decrease the chance
of performance bug injection.
Conclusions: We believe that our approach can help practitioners to eliminate
performance bugs early in the development cycle. Our results are also of
interest to theoreticians, establishing a link between functional bugs and
(non-functional) performance bugs, and explicitly showing that attributes used
for prediction of functional bugs can be used for prediction of performance
bugs.
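For readers less familiar with the three figures reported in the Results above, the following minimal sketch shows how recall, accuracy, and precision are derived from a confusion matrix. The counts are hypothetical, chosen only so that the rounded values match the reported ones; they are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp)                  # flagged files that were truly buggy
    recall = tp / (tp + fn)                     # buggy files that were flagged
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # all correct decisions
    return precision, recall, accuracy

# Hypothetical counts (an assumption for illustration, not the study's data).
precision, recall, accuracy = classification_metrics(tp=73, fp=3, fn=27, tn=97)
print(round(precision, 2), round(recall, 2), round(accuracy, 2))
```

High precision combined with lower recall, as in this example, means that few alarms are false but a share of the real performance bugs goes unflagged.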
A Landscape Perspective on Bird Beak Deformity: An Epizootic of Unknown Etiology
Although birds with beak deformities have been documented throughout the literature, the recent spike in occurrences in certain regions has caused concern in the scientific community. A major concern relates to the role of contaminants and environmental degradation in causing or exacerbating this epizootic. This study used spatial and statistical analyses to examine the problem from a landscape perspective. The objectives of this study were to 1) locate and compile a database of known bird beak deformity occurrences, 2) conduct a preliminary assessment of the environmental correlates of this epizootic in order to identify patterns, and 3) make recommendations that could guide future research and data collection. Logistic regression models were generated using known occurrences of bird beak deformity as well as randomly generated points compared with spatial data on relevant environmental variables. Generalized linear models predicted a high probability (p(deformity)=0.88) of deformity occurring when all environmental variables were present. With more collaboration among researchers and data sharing, this method could provide insight into the currently unknown etiology of bird beak deformity.
The Co-Evolution of Test Maintenance and Code Maintenance through the lens of Fine-Grained Semantic Changes
Automatic testing is a widely adopted technique for improving software
quality. Software developers add, remove and update test methods and test
classes as part of the software development process as well as during the
evolution phase, following the initial release. In this work we conduct a large
scale study of 61 popular open source projects and report the relationships we
have established between test maintenance, production code maintenance, and
semantic changes (e.g., statement added, method removed, etc.) performed in
developers' commits.
We build predictive models, and show that the number of tests in a software
project can be well predicted by employing code maintenance profiles (i.e., how
many commits were performed in each of the maintenance activities: corrective,
perfective, adaptive). Our findings also reveal that more often than not,
developers perform code fixes without performing complementary test maintenance
in the same commit (e.g., update an existing test or add a new one). When
developers do perform test maintenance, it is likely to be affected by the
semantic changes they perform as part of their commit.
Our work is based on studying 61 popular open source projects, comprising
over 240,000 commits consisting of over 16,000,000 semantic change type
instances, performed by over 4,000 software engineers.
Comment: postprint, ICSME 201
A Bayesian Network Approach to Estimating Software Reliability of RSG-GAS Reactor Protection System
Reliability represents one of the most important attributes of software quality. Assessing the reliability of software embedded in the safety of highly critical systems is essential. Unfortunately, there are many factors influencing software reliability that cannot be measured directly. Furthermore, the existing models and approaches for assessing software reliability have assumptions and limitations which are not directly acceptable for all systems, such as reactor protection systems. This paper presents the results of a study that aims to quantitatively assess the software reliability of the reactor protection system (RPS) of RSG-GAS based on the software development life cycle. A Bayesian network (BN) is applied in this research and used to predict software defects in operation, which represent the software reliability. The availability of operation failure data, characteristics of the RPS components and their operation features, prior knowledge on the software development and system reliability, as well as relevant findings from references were considered in the assessment and the construction of nodes in the causal network model. The structure of the causal model consists of eight nodes including design quality, problem complexity, and defects inserted in the software. The calculation result using AgenaRisk software revealed that software defects in the operation of the RPS follow a binomial distribution with a mean of 1.393. This number indicates a high software maturity level and high capability of the organization. An improvement in the software defect concentration range of the posterior distribution compared with the prior is also identified. The result achieved is valuable for further reliability estimation by introducing new evidence and experience data, and by setting up an appropriate plan in order to enhance software reliability in the RPS.
Applications of Causality and Causal Inference in Software Engineering
Causal inference is a study of causal relationships between events and the
statistical study of inferring these relationships through interventions and
other statistical techniques. Causal reasoning is any line of work toward
determining causal relationships, including causal inference. This paper
explores the relationship between causal reasoning and various fields of
software engineering. This paper aims to uncover which software engineering
fields are currently benefiting from the study of causal inference and causal
reasoning, as well as which aspects of various problems are best addressed
using this methodology. With this information, this paper also aims to find
future subjects and fields that would benefit from this form of reasoning and
to provide that information to future researchers. This paper follows a
systematic literature review, including the formulation of a search query,
inclusion and exclusion criteria of the search results, clarifying questions
answered by the found literature, and synthesizing the results from the
literature review. Through close examination of the 45 found papers relevant to
the research questions, it was revealed that the majority of causal reasoning
as related to software engineering is related to testing through root cause
localization. Furthermore, most causal reasoning is done informally through an
exploratory process of forming a Causality Graph as opposed to strict
statistical analysis or introduction of interventions. Finally, causal
reasoning is also used as a justification for many tools intended to make the
software more human-readable by providing additional causal information to
logging processes or modeling languages.