Software defect prediction using Bayesian networks
Many different software metrics have been proposed and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be more practical to determine the set of metrics that are most important and focus on them to predict defectiveness. We use Bayesian networks to determine the probabilistic influential relationships among software metrics and defect proneness. In addition to the metrics used in the Promise data repository, we define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influential relationships among metrics and defectiveness. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics, whereas coupling between objects (CBO), weighted methods per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on the Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings.
Towards Automated Performance Bug Identification in Python
Context: Software performance is a critical non-functional requirement,
appearing in many fields such as mission critical applications, financial, and
real time systems. In this work we focused on early detection of performance
bugs; our software under study was a real time system used in the
advertisement/marketing domain.
Goal: Find a simple and easy to implement solution, predicting performance
bugs.
Method: We built several models using four machine learning methods, commonly
used for defect prediction: C4.5 Decision Trees, Naïve Bayes, Bayesian
Networks, and Logistic Regression.
Results: Our empirical results show that a C4.5 model, using lines of code
changed, file's age and size as explanatory variables, can be used to predict
performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that
reducing the number of changes delivered on a commit, can decrease the chance
of performance bug injection.
Conclusions: We believe that our approach can help practitioners to eliminate
performance bugs early in the development cycle. Our results are also of
interest to theoreticians, establishing a link between functional bugs and
(non-functional) performance bugs, and explicitly showing that attributes used
for prediction of functional bugs can be used for prediction of performance
bugs.
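The recall, accuracy, and precision figures quoted in the Results are standard confusion-matrix ratios. A minimal sketch of how such values are computed follows; the function name and the counts in the usage line are hypothetical illustrations and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    tp: files correctly flagged as containing a performance bug
    fp: clean files wrongly flagged
    fn: buggy files that were missed
    tn: clean files correctly left unflagged
    """
    precision = tp / (tp + fp)            # share of flagged files that are truly buggy
    recall = tp / (tp + fn)               # share of buggy files that were flagged
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# Hypothetical counts chosen only to illustrate the arithmetic.
precision, recall, accuracy = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

A high precision paired with a lower recall, as reported in the abstract, would mean that flagged files are almost always buggy while some buggy files still go unflagged.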
Supporting Defect Causal Analysis in Practice with Cross-Company Data on Causes of Requirements Engineering Problems
[Context] Defect Causal Analysis (DCA) represents an efficient practice to
improve software processes. While knowledge on cause-effect relations is
helpful to support DCA, collecting cause-effect data may require significant
effort and time. [Goal] We propose and evaluate a new DCA approach that uses
cross-company data to support the practical application of DCA. [Method] We
collected cross-company data on causes of requirements engineering problems
from 74 Brazilian organizations and built a Bayesian network. Our DCA approach
uses the diagnostic inference of the Bayesian network to support DCA sessions.
We evaluated our approach by applying a model for technology transfer to
industry and conducted three consecutive evaluations: (i) in academia, (ii)
with industry representatives of the Fraunhofer Project Center at UFBA, and
(iii) in an industrial case study at the Brazilian National Development Bank
(BNDES). [Results] We received positive feedback in all three evaluations and
the cross-company data was considered helpful for determining main causes.
[Conclusions] Our results strengthen our confidence that supporting DCA with
cross-company data is promising and should be further investigated.
Comment: 10 pages, 8 figures, accepted for the 39th International Conference
on Software Engineering (ICSE'17)
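Diagnostic inference, as mentioned in the Method, reasons from an observed problem back to its likely causes. The sketch below shows the idea with Bayes' rule for the simplest possible case: a single problem node with mutually exclusive candidate causes. This is a simplifying assumption made for illustration; the cause names and probabilities are hypothetical and do not come from the paper's network or data.

```python
def diagnose(priors, likelihoods):
    """Posterior P(cause | problem observed) for mutually exclusive causes.

    priors:      {cause: P(cause)}
    likelihoods: {cause: P(problem | cause)}
    """
    # Joint probability of each cause together with the observed problem.
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())  # P(problem), the normalizing constant
    return {cause: p / evidence for cause, p in joint.items()}


# Hypothetical numbers, for illustration only.
posterior = diagnose(
    priors={"cause_a": 0.5, "cause_b": 0.5},
    likelihoods={"cause_a": 0.2, "cause_b": 0.6},
)
ranked = sorted(posterior, key=posterior.get, reverse=True)
```

Ranking causes by posterior probability is one way such a network could suggest which causes to discuss first in a DCA session; a full Bayesian network would handle multiple interacting causes rather than this single-node case.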