Search CORE

89 research outputs found

Software defect prediction: do different classifiers find the same defects?

Author: AT Mısırlı
B Turhan
C Catal
C Seiffert
C Soares
D Gray
D Gray
David Bowes
DH Wolpert
E Arisholm
H Chen
I Witten
IH Laradji
Jean Petrić
K Elish
L Briand
L Madeyski
M D’Ambros
M Shepperd
M Shepperd
M Shepperd
MA Hall
N Fenton
NV Chawla
R Malhotra
S Lessmann
T Hall
T Khoshgoftaar
T Menzies
Tracy Hall
U Fayyad
W Chen
Y Zhou
Z Sun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes is captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.Peer reviewedFinal Published versio

Crossref

Springer - Publisher Connector

Lancaster E-Prints

University of Hertfordshire Research Archive

Analysis of Data mining based Software Defect Prediction Techniques

Author: Naheed Azeem
Shazia Usmani
Publication venue: Global Journals Inc. (US)
Publication date: 23/08/2011
Field of study

Software bug repository is the main resource for fault prone modules. Different data mining algorithms are used to extract fault prone modules from these repositories. Software development team tries to increase the software quality by decreasing the number of defects as much as possible. In this paper different data mining techniques are discussed for identifying fault prone modules as well as compare the data mining algorithms to find out the best algorithm for defect prediction

Global Journal of Computer Science and Technology (GJCST)

Absolute Correlation Weighted Naïve Bayes for Software Defect Prediction

Author: Asmono R. T. (Rizky)
Syukur A. (Abdul)
Wahono R. S. (Romi)
Publication venue: IlmuKomputer.com
Publication date: 01/01/2015
Field of study

The maintenance phase of the software project can be very expensive for the developer team and harmful to the users because some flawed software modules. It can be avoided by detecting defects as early as possible. Software defect prediction will provide an opportunity for the developer team to test modules or files that have a high probability defect. Naïve Bayes has been used to predict software defects. However, Naive Bayes assumes all attributes are equally important and are not related each other while, in fact, this assumption is not true in many cases. Absolute value of correlation coefficient has been proposed as weighting method to overcome Naïve Bayes assumptions. In this study, Absolute Correlation Weighted Naïve Bayes have been proposed. The results of parametric test on experiment results show that the proposed method improve the performance of Naïve Bayes for classifying defect-prone on software defect prediction

Neliti

Further thoughts on precision

Author: Bowes David
Christianson B.
Davey N.
Gray D.
Sun Yi
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2011
Field of study

Background: There has been much discussion amongst automated software defect prediction researchers regarding use of the precision and false positive rate classifier performance metrics. Aim: To demonstrate and explain why failing to report precision when using data with highly imbalanced class distributions may provide an overly optimistic view of classifier performance. Method: Well documented examples of how dependent class distribution affects the suitability of performance measures. Conclusions: When using data where the minority class represents less than around 5 to 10 percent of data points in total, failing to report precision may be a critical mistake. Furthermore, deriving the precision values omitted from studies can reveal valuable insight into true classifier performancePeer reviewedFinal Accepted Versio

CiteSeerX

Crossref

Lancaster E-Prints

University of Hertfordshire Research Archive

Quality Assessment and Prediction in Software Product Lines

Author: Devine Thomas Ryan
Publication venue: The Research Repository @ WVU
Publication date: 01/05/2013
Field of study

At the heart of product line development is the assumption that through structured reuse later products will be of a higher quality and require less time and effort to develop and test. This thesis presents empirical results from two case studies aimed at assessing the quality aspect of this claim and exploring fault prediction in the context of software product lines. The first case study examines pre-release faults and change proneness of four products in PolyFlow, a medium-sized, industrial software product line; the second case study analyzes post-release faults using pre-release data over seven releases of four products in Eclipse, a very large, open source software product line.;The goals of our research are (1) to determine the association between various software metrics, as well as their correlation with the number of faults at the component/package level; (2) to characterize the fault and change proneness of components/packages at various levels of reuse; (3) to explore the benefits of the structured reuse found in software product lines; and (4) to evaluate the effectiveness of predictive models, built on a variety of products in a software product line, to make accurate predictions of pre-release software faults (in the case of PolyFlow) and post-release software faults (in the case of Eclipse).;The research results of both studies confirm, in a software product line setting, the findings of others that faults (both pre- and post-release) are more highly correlated to change metrics than to static code metrics, and are mostly contained in a small set of components/ packages. The longitudinal aspect of our research indicates that new products do benefit from the development and testing of previous products. The results also indicate that pre-existing components/packages, including the common components/packages, undergo continuous change, but tend to sustain low fault densities. However, this is not always true for newly developed components/packages. Finally, the results also show that predictions of pre-release faults in the case of PolyFlow and post-release faults in the case of Eclipse can be done accurately from pre-release data, and furthermore, that these predictions benefit from information about additional products in the software product lines

The Research Repository @ WVU (West Virginia University)

Automation of Cellular Network Faults

Author: Johnson I. Agbinya
Okuthe P. Kogeda
Publication venue: 'IntechOpen'
Publication date: 26/04/2011
Field of study

IntechOpen

Recommended from our members

Software Defect Prediction: Do Different Classifiers Find the Same Defects?

Author: Hall T
Publication venue
Publication date: 01/01/2017
Field of study

During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Na¨ıve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes is captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction

Brunel University Research Archive

Java Method Calls in the Hierarchy � Uncovering Yet another Inheritance Foible

Author: Emal Nasseri
Steve Counsell
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2010
Field of study

Crossref