Software defect prediction based on association rule classification.

Author: Baesens Bart
Baojun Ma
Dejaeger Karel
Vanthienen Jan

In software defect prediction, predictive models are estimated based on various code attributes to assess the likelihood of software modules containing errors. Many classification methods have been suggested to accomplish this task. However, association-based classification methods have not been investigated so far in this context. This paper assesses the use of such a classification method, CBA2, and compares it to other rule-based classification methods. Furthermore, we investigate whether rule sets generated on data from one software project can be used to predict defective software modules in other, similar software projects. It is found that applying the CBA2 algorithm results in both accurate and comprehensible rule sets.
Keywords: Software defect prediction; Association rule classification; CBA2; AUC
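As a rough illustration of what an association rule classifier works with, the sketch below computes the support and confidence of a single class association rule on toy data. The item names, the `modules` data, and the `rule_stats` helper are hypothetical and are not taken from the paper; CBA-style methods typically rank candidate rules by such measures, but this is not the CBA2 algorithm itself.

```python
# Hypothetical toy data: each module is a set of discretized attribute items
# plus a class label. All names and values are illustrative assumptions.
modules = [
    ({"loc=high", "complexity=high"}, "defective"),
    ({"loc=high", "complexity=low"}, "clean"),
    ({"loc=high", "complexity=high"}, "defective"),
    ({"loc=low", "complexity=low"}, "clean"),
    ({"loc=low", "complexity=high"}, "clean"),
]


def rule_stats(antecedent, label, data):
    """Return (support, confidence) of the rule `antecedent -> label`."""
    # Labels of all modules whose items contain the whole antecedent.
    covered = [cls for items, cls in data if antecedent <= items]
    # Covered modules that also carry the predicted label.
    correct = sum(1 for cls in covered if cls == label)
    support = correct / len(data)
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


support, confidence = rule_stats({"complexity=high"}, "defective", modules)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule with high confidence but very low support covers few modules, which is one reason both measures are usually considered together.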

Research Papers in Economics

Predicting Fault-prone Software Module Using Data Mining Technique and Fuzzy Logic

Author: Goyal Neeraj Kumar
Pandey Ajeet Kumar
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 22/08/2020

This paper discusses a new model for improving the reliability and quality of software systems by predicting fault-prone modules before testing. The model uses the classification capability of data mining techniques and the knowledge stored in software metrics to classify a software module as fault-prone or not fault-prone. A decision tree is constructed with the ID3 algorithm from existing project data in order to gain information for deciding whether a particular module is fault-prone. The gained information is converted into fuzzy rules and integrated with a fuzzy inference system to predict whether software modules in the target data are fault-prone. The model is also able to predict the degree of fault-proneness of a faulty module. The goal is to help software managers concentrate their testing efforts on fault-prone modules in order to improve the reliability and quality of the software system. We used NASA project data sets from the PROMISE repository to validate the predictive accuracy of the model.
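ID3 chooses the attribute to split on by information gain. The sketch below shows that criterion on toy data; the attribute names (`loc`, `churn`), the labels, and the helper functions are hypothetical and do not reproduce the paper's metrics, data, or fuzzy rule conversion.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical discretized metrics; "fp" = fault-prone, "nfp" = not fault-prone.
rows = [
    {"loc": "high", "churn": "high"},
    {"loc": "high", "churn": "low"},
    {"loc": "low", "churn": "high"},
    {"loc": "low", "churn": "low"},
]
labels = ["fp", "fp", "nfp", "nfp"]

gain_loc = information_gain(rows, labels, "loc")
gain_churn = information_gain(rows, labels, "churn")
# ID3 would place the attribute with the highest gain at the root of the tree.
best = max(["loc", "churn"], key=lambda a: information_gain(rows, labels, a))
print(best, gain_loc, gain_churn)
```

In this toy data `loc` separates the classes perfectly while `churn` carries no information, so `loc` is selected.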

Interscience Research Network

Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)

Author: Brown André EX
Ch'ng Quee-Lim
Currie Michael
Grundy Laura J
Hokanson Jim
Javer Avelino
Kerr Rex
Lee Chee Wai
Li Chris
Li Kezhi
Schafer William R
Yemini Eviatar
Publication date: 20/02/2018

We report and fix an important systematic error in prior studies that ranked classifiers for software analytics. Those studies did not (a) assess classifiers on multiple criteria and they did not (b) study how variations in the data affect the results. Hence, this paper applies (a) multi-criteria tests while (b) fixing the weaker regions of the training data (using SMOTUNED, which is a self-tuning version of SMOTE). This approach leads to dramatically large increases in software defect prediction performance. When applied in a 5*5 cross-validation study for 3,681 JAVA classes (containing over a million lines of code) from open source systems, SMOTUNED increased AUC and recall by 60% and 20% respectively. These improvements are independent of the classifier used to predict for quality. The same kind of pattern (improvement) was observed when a comparative analysis of SMOTE and SMOTUNED was done against the most recent class imbalance technique. In conclusion, for software analytic tasks like defect prediction, (1) data pre-processing can be more important than classifier choice, (2) ranking studies are incomplete without such pre-processing, and (3) SMOTUNED is a promising candidate for pre-processing.
Comment: 10 pages + 2 references. Accepted to International Conference of Software Engineering (ICSE), 201
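For readers unfamiliar with SMOTE, the sketch below shows its core step: creating a synthetic minority-class point by interpolating between a sample and one of its nearest minority neighbours. It is a minimal illustration of plain SMOTE, not of SMOTUNED; the `smote_sample` helper, the toy points, and the choice of `k` are assumptions, and a self-tuning variant would search over settings such as `k` rather than fix them.

```python
import random


def smote_sample(minority, k, rng):
    """Create one synthetic point between a minority sample and one of its
    k nearest minority-class neighbours (squared Euclidean distance)."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment from base to neighbour
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical minority-class (defective) modules described by two metrics.
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
for point in synthetic:
    print(point)
```

Every synthetic point lies on a segment between two existing minority points, so the oversampled class stays within the region the minority data already occupies.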

arXiv.org e-Print Archive

ZENODO

FigShare

Less is more: Temporal fault predictive performance over multiple Hadoop releases

Author: Harman Mark
Islam S.
Jia Yue
Minku Leandro L.
Sarro Federica
Srivisut Komsan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

We investigate search based fault prediction over time based on 8 consecutive Hadoop versions, aiming to analyse the impact of chronology on fault prediction performance. Our results confound the assumption, implicit in previous work, that additional information from historical versions improves prediction; though G-mean tends to improve, Recall can be reduced
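The sketch below illustrates how G-mean can rise while recall falls. It assumes the common definition of G-mean as the geometric mean of recall and specificity, which may differ in detail from the paper's; the confusion-matrix counts are invented for illustration and are not results from the study.

```python
from math import sqrt


def recall(tp, fn):
    """Fraction of faulty modules that were predicted faulty."""
    return tp / (tp + fn)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of recall and specificity (assumed definition)."""
    specificity = tn / (tn + fp)
    return sqrt(recall(tp, fn) * specificity)


# Hypothetical confusion-matrix counts for two models on the same release.
fewer_versions = dict(tp=18, fn=2, tn=40, fp=40)
more_versions = dict(tp=16, fn=4, tn=72, fp=8)

for name, m in [("fewer", fewer_versions), ("more", more_versions)]:
    print(name, round(recall(m["tp"], m["fn"]), 3), round(g_mean(**m), 3))
```

Here the second model finds fewer faulty modules (lower recall) but raises far fewer false alarms, so its G-mean is higher.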

UEL Research Repository at University of East London

CiteSeerX

Crossref

University of Birmingham Research Portal

Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)

Author: Bennin Kwabena Ebo
Chiha I.
Ghotra Baljinder
Menzies Tim
Omran M.
Pedregosa Fabian
Refaeilzadeh Payam
Tan Ming
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/02/2018

arXiv.org e-Print Archive

Crossref

Software Defect Prediction Based on Classification Rule Mining

Author: Sahana Dulal Chandra
Publication date: 01/01/2013

Software development has grown rapidly. Due to various causes, software comes with many defects. In the software development process, testing is the main phase that reduces the defects of the software. If a developer or a tester can predict software defects properly, it reduces cost, time, and effort. In this paper, we present a comparative analysis of software defect prediction based on classification rule mining. We propose a scheme for this process, choose different classification algorithms, and compare their predictions in software defect analysis. This evaluation analyzes the prediction performance of competing learning schemes for given historical data sets (NASA MDP Data Set). The result of this scheme evaluation shows that we have to choose different classifier rules for different data sets.

ethesis@nitr

Predicting Test Case Verdicts Using Textual Analysis of Committed Code Churns

Author: Al Sabbagh Khaled
Hebig Regina
Meding Wilhelm
Staron Miroslaw
Publication date: 01/01/2019

Background: Continuous Integration (CI) is an agile software development practice that involves producing several clean builds of the software per day. The creation of these builds involves running excessive executions of automated tests, which is hampered by high hardware cost and reduced development velocity. Goal: The goal of our research is to develop a method that reduces the number of executed test cases at each CI cycle. Method: We adopt a design research approach with an infrastructure provider company to develop a method that exploits Machine Learning (ML) to predict test case verdicts for committed source code. We train five different ML models on two data sets and evaluate their performance using two simple retrieval measures: precision and recall. Results: While the results from training the ML models on the first data set of test executions revealed low performance, the curated data set for training showed an improvement in performance with respect to precision and recall. Conclusion: Our results indicate that the method is applicable when training the ML model on churns of small size.

Chalmers Research

Adaptive Genetic Algorithm Based Artificial Neural Network for Software Defect Prediction

Author: Prof. Bachala Sathyanarayana
Racharla Suresh Kumar
Publication venue: Global Journals Inc. (US)
Publication date: 04/11/2015

To meet the requirement of efficient software defect prediction, this paper develops an evolutionary computing based neural network learning scheme that alleviates existing Artificial Neural Network (ANN) limitations such as local minima and convergence issues. To achieve optimal software defect prediction, an Adaptive Genetic Algorithm (A-GA) based ANN learning and weight estimation scheme has been developed. Unlike conventional GA, we use adaptive crossover and mutation probability parameters that alleviate the issue of disruption towards the optimal solution. We use object-oriented software metrics (CK metrics) for fault prediction, and the proposed Evolutionary Computing Based Hybrid Neural Network (HENN) algorithm has been examined in terms of accuracy, precision, recall, F-measure, completeness, etc., where it performed better than major existing schemes. The proposed scheme exhibited 97.99% prediction accuracy while ensuring optimal precision, F-measure, and recall.

Global Journal of Computer Science and Technology (GJCST)