Software defect prediction: do different classifiers find the same defects?

Author: AT Mısırlı
B Turhan
C Catal
C Seiffert
C Soares
D Gray
David Bowes
DH Wolpert
E Arisholm
H Chen
I Witten
IH Laradji
Jean Petrić
K Elish
L Briand
L Madeyski
M D’Ambros
M Shepperd
MA Hall
N Fenton
NV Chawla
R Malhotra
S Lessmann
T Hall
T Khoshgoftaar
T Menzies
Tracy Hall
U Fayyad
W Chen
Y Zhou
Z Sun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix, and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.

Peer reviewed. Final published version.
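The abstract's central observation, that classifiers with similar scores can still detect different defects, can be illustrated with a small set comparison. This is a minimal sketch and not the authors' analysis: the module identifiers and the per-classifier detections below are hypothetical, and `unique_detections` is a helper name introduced here for illustration only.

```python
from typing import Dict, Set


def unique_detections(detected: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each classifier, return the defects that no other classifier detected."""
    unique = {}
    for name, found in detected.items():
        # Union of everything the *other* classifiers detected.
        others = set().union(*(d for n, d in detected.items() if n != name))
        unique[name] = found - others
    return unique


# Hypothetical true positives (defective modules correctly flagged) per classifier.
detected = {
    "RandomForest": {"m1", "m2", "m3"},
    "NaiveBayes": {"m2", "m3", "m4"},
    "RPart": {"m1", "m3"},
    "SVM": {"m3", "m5"},
}

unique = unique_detections(detected)
found_by_all = set.intersection(*detected.values())

for name, only_here in unique.items():
    print(f"{name}: found only by this classifier -> {sorted(only_here)}")
print(f"Found by every classifier -> {sorted(found_by_all)}")
```

In this made-up data a majority vote would discard the defects that only one classifier finds, which is the kind of loss the abstract's conclusion about non-majority decision strategies refers to.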

Crossref

Springer - Publisher Connector

Lancaster E-Prints

University of Hertfordshire Research Archive

Software defect prediction based on association rule classification.

Author: Baesens Bart
Baojun Ma
Dejaeger Karel
Vanthienen Jan

In software defect prediction, predictive models are estimated based on various code attributes to assess the likelihood of software modules containing errors. Many classification methods have been suggested to accomplish this task. However, association-based classification methods have not been investigated so far in this context. This paper assesses the use of such a classification method, CBA2, and compares it to other rule-based classification methods. Furthermore, we investigate whether rule sets generated on data from one software project can be used to predict defective software modules in other, similar software projects. It is found that applying the CBA2 algorithm results in both accurate and comprehensible rule sets.

Keywords: Software defect prediction; Association rule classification; CBA2; AUC

Research Papers in Economics

A Novel Developed Supervised Machine Learning System For Classification And Prediction of Software Faults Using NASA Dataset

Author: Gupta Nikita
Sinha Ripu Ranjan
Publication venue: Auricle Global Society of Education and Research
Publication date: 07/10/2023

The software systems of modern computers are extremely complex and versatile, so it is essential to detect and correct software design faults regularly. To devote resources effectively to the creation of trustworthy software, software companies are increasingly predicting fault-prone modules in advance of testing. These software fault prediction methods rely on how thoroughly the faults and related code of prior software versions have been retrieved. Time, energy, and money are all saved as a result, and the company's initial success and bottom line improve greatly because its clientele is satisfied. Numerous researchers have worked in this area over the years in an effort to raise software quality. Today, the most commonly used approaches in this field are based on machine learning (ML), which seeks to build software capable of evolving and adapting in response to new data. This paper introduces a new ML approach that brings together a number of different expert systems. To reach agreement on which aspects of a software system need to be tested, the proposed multi-classifier model pools the strengths of the most effective classifiers. Several top-performing classifiers for defect prediction are compared in an experimental evaluation. We test our method on 16 publicly available datasets from the NASA Metric Data Programme (MDP) repository at the PROMISE repository. Confusion matrix parameters, recall, precision, recognition accuracy, and other measures are evaluated in Python and contrasted with existing schemes. The experimental outcomes demonstrate that combining LGBM, XGBoost, and Voting classifiers in a multi-classifier approach significantly improves software fault prediction performance.
The results of the investigation show that the suggested method will lead to better practical outcomes in the prediction of device failures.
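The voting idea mentioned in this abstract can be sketched with hard majority voting over binary fault labels, followed by precision and recall on the combined prediction. This is an illustrative sketch only and not the paper's system: the three prediction lists and the ground truth are hypothetical, the function names are introduced here, and the tie-breaking rule (ties count as fault-prone) is an assumption.

```python
from typing import List, Sequence, Tuple


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary label lists (1 = fault-prone) by hard majority voting.

    Assumption: a tie is resolved as fault-prone (1).
    """
    if not predictions:
        raise ValueError("at least one classifier's predictions are required")
    n_models = len(predictions)
    n_modules = len(predictions[0])
    if any(len(p) != n_modules for p in predictions):
        raise ValueError("all classifiers must predict the same modules")
    # zip(*predictions) yields, per module, the votes of all classifiers.
    return [1 if 2 * sum(votes) >= n_models else 0 for votes in zip(*predictions)]


def precision_recall(truth: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Compute precision and recall for the fault-prone class (label 1)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical per-module predictions from three classifiers, plus ground truth.
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 1, 0]
truth = [1, 1, 0, 0]

combined = majority_vote([model_a, model_b, model_c])
precision, recall = precision_recall(truth, combined)
print(combined, round(precision, 3), recall)
```

Here the combined vote recovers both faulty modules even though no single hypothetical classifier does, at the cost of one false alarm.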

International Journal on Recent and Innovation Trends in Computing and Communication

A FRAMEWORK FOR SOFTWARE RELIABILITY MANAGEMENT BASED ON THE SOFTWARE DEVELOPMENT PROFILE MODEL

Author: Khoshkhou Arya
Publication date: 01/01/2011

Recent empirical studies of software have shown a strong correlation between the change history of files and their fault-proneness. Statistical data analysis techniques, such as regression analysis, have been applied to validate this finding. While these regression-based models show a correlation between selected software attributes and defect-proneness, in most cases they are inadequate for demonstrating causality. For this reason, we introduce the Software Development Profile Model (SDPM) as a causal model for identifying defect-prone software artifacts based on their change history and software development activities. The SDPM is based on the assumption that human error during software development is the sole cause of defects leading to software failures. The SDPM assumes that when a software construct is touched, it has a chance to become defective. Software development activities such as inspection, testing, and rework further affect the remaining number of software defects. Under this assumption, the SDPM estimates the defect content of software artifacts based on software change history and software development activities. The SDPM is an improvement over existing defect estimation models because it not only uses evidence from the current project to estimate defect content but also allows software managers to manage software projects quantitatively by making risk-informed decisions early in the software development life cycle. We apply the SDPM in several real-life software development projects, showing how it is used, analyzing its accuracy in predicting defect-prone files, and comparing the results with the Poisson regression model.

Digital Repository at the University of Maryland