Search CORE

2,867 research outputs found

Support Vector Machines for Credit Scoring and discovery of significant features

Author: Baesens
Cristianini
Duda
Gayler
Guyon
Hand
Hand
Henley
Huang
Huang
Joachims
Jonathan Crook
Lee
Li
Schebesch
Thomas
Tony Bellotti
Van Gestel
Vapnik
Publication venue: 'Elsevier BV'
Publication date: 02/04/2008
Field of study

The assessment of risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit scoring for determining likelihood to default based on consumer application and credit reference agency data. We test support vector machines against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default. 1

CiteSeerX

Prediction of delayed graft function after kidney transplantation : comparison between logistic regression and machine learning methods

Author: Couckuyt Ivo
Decruyenaere Alexander
Decruyenaere Philippe
Dhaene Tom
Peeters Patrick
Vermassen Frank
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

Background: Predictive models for delayed graft function (DGF) after kidney transplantation are usually developed using logistic regression. We want to evaluate the value of machine learning methods in the prediction of DGF. Methods: 497 kidney transplantations from deceased donors at the Ghent University Hospital between 2005 and 2011 are included. A feature elimination procedure is applied to determine the optimal number of features, resulting in 20 selected parameters (24 parameters after conversion to indicator parameters) out of 55 retrospectively collected parameters. Subsequently, 9 distinct types of predictive models are fitted using the reduced data set: logistic regression (LR), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs; using linear, radial basis function and polynomial kernels), decision tree (DT), random forest (RF), and stochastic gradient boosting (SGB). Performance of the models is assessed by computing sensitivity, positive predictive values and area under the receiver operating characteristic curve (AUROC) after 10-fold stratified cross-validation. AUROCs of the models are pairwise compared using Wilcoxon signed-rank test. Results: The observed incidence of DGF is 12.5 %. DT is not able to discriminate between recipients with and without DGF (AUROC of 52.5 %) and is inferior to the other methods. SGB, RF and polynomial SVM are mainly able to identify recipients without DGF (AUROC of 77.2, 73.9 and 79.8 %, respectively) and only outperform DT. LDA, QDA, radial SVM and LR also have the ability to identify recipients with DGF, resulting in higher discriminative capacity (AUROC of 82.2, 79.6, 83.3 and 81.7 %, respectively), which outperforms DT and RF. Linear SVM has the highest discriminative capacity (AUROC of 84.3 %), outperforming each method, except for radial SVM, polynomial SVM and LDA. However, it is the only method superior to LR. Conclusions: The discriminative capacities of LDA, linear SVM, radial SVM and LR are the only ones above 80 %. None of the pairwise AUROC comparisons between these models is statistically significant, except linear SVM outperforming LR. Additionally, the sensitivity of linear SVM to identify recipients with DGF is amongst the three highest of all models. Due to both reasons, the authors believe that linear SVM is most appropriate to predict DGF

Springer - Publisher Connector

CREDIT SCORING USING LOGISTIC REGRESSION

Author: Mathew Ansen
Publication venue: SJSU ScholarWorks
Publication date: 25/05/2017
Field of study

This report presents an approach to predict the credit scores of customers using the Logistic Regression machine learning algorithm. The research objective of this project is to perform a comparative study between feature selection and feature extraction, against the same dataset using the Logistic Regression machine learning algorithm. For feature selection, we have used Stepwise Logistic Regression. For feature extraction, we have used Singular Value Decomposition (SVD) and Weighted Singular Value Decomposition (SVD). In order to test the accuracy obtained using feature selection and feature extraction, we used a public credit dataset having 11 features and 150,000 records. After performing feature reduction, Logistic Regression algorithm was used for classification. In our results, we observed that Stepwise Logistic Regression gave a 14% increase in accuracy as compared to Singular Value Decomposition (SVD) and a 10% increase in accuracy as compared to Weighted Singular Value Decomposition (SVD). Thus, we can conclude that Stepwise Logistic Regression performed significantly better than both Singular Value Decomposition (SVD) and Weighted Singular Value Decomposition (SVD). The benefit of using feature selection was that it helped us in identifying important features, which improved the prediction accuracy of the classifier

SJSU ScholarWorks

Modeling Financial Time Series with Artificial Neural Networks

Author: Versace Massimiliano
Wong Charles
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 15/12/2009
Field of study

Financial time series convey the decisions and actions of a population of human actors over time. Econometric and regressive models have been developed in the past decades for analyzing these time series. More recently, biologically inspired artificial neural network models have been shown to overcome some of the main challenges of traditional techniques by better exploiting the non-linear, non-stationary, and oscillatory nature of noisy, chaotic human interactions. This review paper explores the options, benefits, and weaknesses of the various forms of artificial neural networks as compared with regression techniques in the field of financial time series analysis.CELEST, a National Science Foundation Science of Learning Center (SBE-0354378); SyNAPSE program of the Defense Advanced Research Project Agency (HR001109-03-0001

Boston University Institutional Repository (OpenBU)

A Look at Modern Classification Algorithms for Structural Data

Author: Suther Frank Daniel
Suther Freddy Christoffer
Publication venue
Publication date: 01/01/2018
Field of study