Feature selection in credit risk modeling: an international evidence
This paper aims to discover a suitable combination of contemporary feature selection techniques and robust prediction classifiers.
As such, to examine the impact of the feature selection method
on classifier performance, we use two Chinese and three other
real-world credit scoring datasets. The utilized feature selection
methods are the least absolute shrinkage and selection operator
(LASSO) and multivariate adaptive regression splines (MARS). The examined classifiers are classification and regression
trees (CART), logistic regression (LR), artificial neural network
(ANN), and support vector machines (SVM). Empirical findings
confirm that LASSO feature selection, followed by the robust
SVM classifier, yields a remarkable improvement and
outperforms the other competing classifiers. Moreover, ANN also
offers improved accuracy with feature selection methods, whereas LR
improves classification efficiency only when feature
selection is performed via LASSO. CART, by contrast, shows no
improvement in any combination. The proposed
credit scoring modeling strategy may be used to develop policies, progressive ideas, and operational guidelines for effective credit risk management at lending and other financial institutions. The findings
of this study have practical value because, to date, there is no consensus
about the best combination of feature selection method and prediction classifier.
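The LASSO-then-SVM combination described in this abstract can be sketched as a pipeline. This is a minimal illustration, not the authors' implementation. It assumes scikit-learn is available and uses synthetic data from `make_classification` in place of the credit scoring datasets. The penalty `alpha=0.05`, the RBF kernel, and the function name `run_demo` are arbitrary choices made here.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_demo(alpha=0.05, seed=0):
    # Synthetic stand-in for a credit scoring dataset (assumption):
    # 20 features, of which 5 are informative and 2 are redundant.
    X, y = make_classification(
        n_samples=500, n_features=20, n_informative=5, n_redundant=2,
        n_clusters_per_class=1, random_state=seed,
    )
    pipe = Pipeline([
        ("scale", StandardScaler()),
        # Keep only features whose LASSO coefficient is non-zero.
        ("select", SelectFromModel(Lasso(alpha=alpha))),
        ("svm", SVC(kernel="rbf")),
    ])
    # Selection is refit inside each fold, so held-out data never
    # influences which features are kept.
    scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
    pipe.fit(X, y)
    n_selected = int(pipe.named_steps["select"].get_support().sum())
    return scores.mean(), n_selected


mean_acc, n_selected = run_demo()
print(f"mean CV accuracy: {mean_acc:.3f}, features kept: {n_selected} of 20")
```

Placing the selector inside the pipeline, rather than selecting features once on the full dataset, is a design choice of this sketch that keeps the cross-validated accuracy from being inflated by information leaking from the test folds.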
Customer Churn Prediction
Identifying churned customers plays an essential role in the functioning and growth of any business. It helps the business understand the reasons for churn and plan its market strategies accordingly. This research aims to develop a machine learning model that can precisely predict churned customers among all customers of a Credit Union (CU) financial institution. Quantitative and deductive research strategies are employed to build a supervised machine learning model that addresses the class imbalance problem, handles feature selection, and efficiently predicts customer churn. Overall model accuracy, the Receiver Operating Characteristic (ROC) curve, and the Area Under the ROC Curve (AUC) are used as the evaluation metrics to identify the best classifier. A comparative study of the most popular supervised machine learning methods, namely Logistic Regression, Random Forest, Support Vector Machine (SVM), and Neural Network, was applied to customer churn prediction in a CU context. In the first phase of the experiments, various feature selection techniques were studied. In the second phase, all models were applied to the imbalanced dataset and the results were evaluated. The SMOTE technique was then used to balance the data, and the same models were applied to the balanced dataset; the results were evaluated and compared. The best overall classifier was Random Forest, with an accuracy of almost 97%, precision of 91%, and recall of 98%.
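The balancing step mentioned in this abstract relies on SMOTE, which creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The following is a simplified NumPy sketch of that idea. It is not the study's code or the imbalanced-learn implementation. The function name `smote_sketch`, the toy data, and the choice of `k=3` are assumptions made for illustration.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points from minority samples X_min."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbors than other samples
    # Pairwise squared distances between minority samples.
    d = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(d, axis=1)[:, :k]
    # Pick a random base sample and one of its k nearest neighbors.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    # Interpolate at a random position on the segment between them.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Toy imbalanced data (assumption): 95 majority rows, 5 minority rows.
rng = np.random.default_rng(42)
X_majority = rng.normal(loc=0.0, scale=1.0, size=(95, 2))
X_minority = rng.normal(loc=3.0, scale=0.5, size=(5, 2))

synthetic = smote_sketch(X_minority, n_new=len(X_majority) - len(X_minority), k=3)
X_minority_balanced = np.vstack([X_minority, synthetic])
print(X_majority.shape, X_minority_balanced.shape)
```

In practice, oversampling of this kind is usually applied only to the training split, so that the evaluation data still reflects the original class distribution.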
- …