Supervised and unsupervised data mining approaches in loan default prediction

Alejandrino, Jovanne C.; Murcia, John Vianne Bauya; P. Bolacoy, Jovito Jr.

Authors: Jovanne C. Alejandrino
John Vianne Bauya Murcia
Jovito Jr. P. Bolacoy
Publication date: 1 April 2023
Publisher: 'Institute of Advanced Engineering and Science'
Abstract

Given the paramount importance of data mining in organizations and the possible contribution of a data-driven customer classification recommender system for loan-extending financial institutions, the study applied supervised and unsupervised data mining approaches to derive the best classifier of loan default. A total of 900 instances with determined attributes and class labels were used for training and cross-validation, while prediction used 100 new instances without class labels. In the training phase, J48 with a confidence factor of 50% attained the highest classification accuracy among the J48 settings (76.85%), k-nearest neighbors (k-NN) with k = 3 attained the highest (78.38%) among the IBk variants, naïve Bayes had a classification accuracy of 76.65%, and logistic had a classification accuracy of 77.31%. k-NN 3 and logistic had the highest classification accuracy, F-measures, and kappa statistics. Applying these two algorithms to the test set yielded 48 non-defaulters and 52 defaulters for k-NN 3, and 44 non-defaulters and 56 defaulters for logistic. Implications are discussed in the paper.
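
The abstract compares classifiers by classification accuracy, F-measure, and kappa statistic. As a minimal sketch of how these three figures relate to a two-class confusion matrix, the Python below computes them from raw counts. The function name `binary_metrics` and the counts used are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, F-measure for the positive class, Cohen's kappa)."""
    total = tp + fp + fn + tn
    # Accuracy: share of instances whose predicted class matches the label.
    accuracy = (tp + tn) / total
    # F-measure: harmonic mean of precision and recall.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    # Kappa: agreement beyond what the class marginals would give by chance.
    chance_pos = ((tp + fn) / total) * ((tp + fp) / total)
    chance_neg = ((fp + tn) / total) * ((fn + tn) / total)
    expected = chance_pos + chance_neg
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, f_measure, kappa


# Hypothetical counts (not the study's data): 100 instances, 80 classified correctly.
acc, f1, kappa = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(f"accuracy={acc:.2f} F-measure={f1:.2f} kappa={kappa:.2f}")
```

With balanced classes as in this toy case, chance agreement is 0.5, so an accuracy of 0.80 corresponds to a kappa of 0.60. This is why kappa is reported alongside accuracy when comparing classifiers.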

