
2 research outputs found

A Review Of Training Data Selection In Software Defect Prediction

Author: Abal Abas Zuraida
Ahmad Sabrina
Sinaga Benyamin Langgu
Publication venue: Little Lion Scientific Islamabad Pakistan
Publication date: 01/06/2020

Publicly available datasets pose a challenge: selecting suitable data to train a defect prediction model that predicts defects in other projects. Using a cross-project training dataset without careful selection will degrade defect prediction performance. Consequently, training data selection is an essential step in developing a defect prediction model. This paper aims to synthesize the state of the art in training data selection methods published from 2009 to 2019. The existing approaches to training data selection fall into three groups: nearest neighbour, cluster-based, and evolutionary methods. According to the results in the literature, cluster-based methods tend to outperform nearest neighbour methods. Research on evolutionary techniques, on the other hand, gives promising results but is still scarce. The review therefore concludes that training data selection still has open areas for further investigation. We also present research directions within this area.

Cross-company defect prediction via semi-supervised clustering-based data filtering and MSTrA-based transfer learning

Author: B Turhan
Chuanxiang Ma
D Ryu
Ezgi Erturk
Frank Wilcoxon
IH Laradji
K Fukunaga
K Jain
KO Elish
Kwabena Ebo Bennin
L Breiman
L Chen
L Peng
LC Briand
M Shepperd
Man Wu
Mandi Fu
MJ Siers
N Seliya
NV Chawla
PP Diego Mesquita
Q Song
Ruchika Malhotra
T Hall
V By Kampenes
V Vashisht
Xiao Yu
Y Ma
Yiheng Jian
Z Sun
Ömer Faruk Arar
Publication venue: Springer Science and Business Media LLC