
Support Vector Machines for Credit Scoring and discovery of significant features

Author: Baesens
Cristianini
Duda
Gayler
Guyon
Hand
Hand
Henley
Huang
Huang
Joachims
Jonathan Crook
Lee
Li
Schebesch
Thomas
Tony Bellotti
Van Gestel
Vapnik
Publication venue: 'Elsevier BV'
Publication date: 02/04/2008

The assessment of the risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit scoring to determine the likelihood of default based on consumer application and credit reference agency data. We test support vector machines against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default.

CiteSeerX

One-Class Classification: Taxonomy of Study and Review of Techniques

Author: Khan Shehroz S.
Madden Michael G.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 29/11/2013

One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers by defining the class boundary with knowledge of the positive class alone. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, the algorithms used and the application domains. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research. Comment: 24 pages + 11 pages of references, 8 figures

arXiv.org e-Print Archive

Access to Research at National University of Ireland, Galway

Statistical aspects of credit scoring

Author: Henley William Edward
Publication date: 01/01/1995

This thesis is concerned with statistical aspects of credit scoring, the process of determining how likely an applicant for credit is to default on repayments. In Chapters 1-4 a detailed introduction to credit scoring methodology is presented, including an evaluation of previously published work on credit scoring and a review of discrimination and classification techniques. In Chapter 5 we describe different approaches to measuring the absolute and relative performance of credit scoring models. Two significance tests are proposed for comparing the bad rate amongst the accepts (or the error rate) from two classifiers. In Chapter 6 we consider different approaches to reject inference, the procedure of allocating class membership probabilities to the rejects. One reason for needing reject inference is to reduce the sample selection bias that results from using a sample consisting only of accepted applicants to build new scorecards. We show that the characteristic vectors for the rejects do not contain information about the parameters of the observed data likelihood, unless extra information or assumptions are included. Methods of reject inference which incorporate additional information are proposed. In Chapter 7 we make comparisons of a range of different parametric and nonparametric classification techniques for credit scoring: linear regression, logistic regression, projection pursuit regression, Poisson regression, decision trees and decision graphs. We conclude that classifier performance is fairly insensitive to the particular technique adopted. In Chapter 8 we describe the application of the k-NN method to credit scoring. We propose using an adjusted version of the Euclidean distance metric, which is designed to incorporate knowledge of class separation contained in the data. We evaluate properties of the k-NN classifier through empirical studies and make comparisons with existing techniques.
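
The abstract does not state the form of the adjusted metric, so the following is only an illustrative sketch of how a Euclidean distance might be adjusted to reflect class separation. It assumes a distance of the form d(x, y)^2 = |x - y|^2 + D (w . (x - y))^2, where `w` is a hypothetical direction along which the classes separate and `D` is a hypothetical tuning constant; the function names and the toy data are likewise assumptions, not taken from the thesis.

```python
import math
from collections import Counter

def adjusted_distance(x, y, w, D):
    """Euclidean distance with extra weight D on differences along direction w.

    Assumed form (not from the thesis): d^2 = |x - y|^2 + D * (w . (x - y))^2.
    With D = 0 this is the ordinary Euclidean distance.
    """
    diff = [a - b for a, b in zip(x, y)]
    euclid_sq = sum(d * d for d in diff)
    along_w = sum(wi * di for wi, di in zip(w, diff))
    return math.sqrt(euclid_sq + D * along_w ** 2)

def knn_predict(train, query, w, D, k=3):
    """Majority vote over the k training points nearest to query.

    train is a list of (features, label) pairs.
    """
    nearest = sorted(
        train, key=lambda row: adjusted_distance(row[0], query, w, D)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): the classes differ along the first feature only.
train = [
    ((0.0, 2.0), "good"), ((0.1, -2.0), "good"), ((0.0, 2.5), "good"),
    ((1.0, 0.0), "bad"), ((1.1, 0.3), "bad"), ((0.9, -0.3), "bad"),
]
w = (1.0, 0.0)      # assumed class-separating direction
query = (0.4, 0.0)

plain = knn_predict(train, query, w, D=0)       # ordinary Euclidean k-NN
adjusted = knn_predict(train, query, w, D=100)  # separation-aware k-NN
print(plain, adjusted)
```

On this toy data the two settings disagree: the ordinary metric is dominated by the second feature, which carries no class information, while the adjusted metric gives more weight to differences along `w`.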