Does segmentation always improve model performance in credit scoring?

Authors: Anderson; Armitage; Aurifeille; Banasik; Berk; Breiman; Chan; Desmet; Greene; Hand; Hand; Hawkins; Hosmer; Izenman; Kass; Katarzyna Bijak; Landwehr; Larose; Loh; Lyn C. Thomas; Makuch; Mays; Siddiqi; Thomas; Thomas; Van Gestel; Wedel; Yobas

Publication date: 1 February 2012
Publisher: Elsevier BV

Abstract

Credit scoring allows for the credit risk assessment of bank customers. A single scoring model (scorecard) can be developed for the entire customer population, e.g. using logistic regression. However, it is often expected that segmentation, i.e. dividing the population into several groups and building separate scorecards for them, will improve the model performance. The most common statistical methods for segmentation are the two-step approaches, in which logistic regression follows a tree-based method such as Classification and Regression Trees (CART) or Chi-squared Automatic Interaction Detection (CHAID). In this research, the two-step approaches are applied as well as a new, simultaneous method, in which both segmentation and scorecards are optimised at the same time: Logistic Trees with Unbiased Selection (LOTUS). For reference purposes, a single-scorecard model is used. The above-mentioned methods are applied to the data provided by two of the major UK banks and one of the European credit bureaus. The model performance measures are then compared to examine whether there is improvement due to the segmentation methods used. It is found that segmentation does not always improve model performance in credit scoring: for none of the analysed real-world datasets do the multi-scorecard models perform considerably better than the single-scorecard ones. Moreover, in this application, there is no difference in performance between the two-step and simultaneous approaches.
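The two-step idea described in the abstract (segment first, then fit one logistic regression scorecard per segment, and compare against a single scorecard) can be illustrated with a minimal sketch. This is not the authors' implementation: a one-split, CART-style stump stands in for a full CART or CHAID tree, LOTUS is not shown, the data are synthetic, and AUC and the Gini coefficient (2 × AUC − 1) are assumed here as the performance measures. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # w = [intercept, w_1, ..., w_p]; returns the predicted probability of "bad".
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, epochs=150, lr=0.5):
    # Logistic regression scorecard fitted by plain batch gradient descent.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def gini_impurity(labels):
    # Node impurity for binary labels, as used by CART.
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(X, y, feature, n_candidates=19):
    # Step 1: choose the threshold on one feature that minimises
    # the size-weighted Gini impurity of the two child nodes.
    values = sorted(x[feature] for x in X)
    best_t, best_score = None, float("inf")
    for k in range(1, n_candidates + 1):
        t = values[k * len(values) // (n_candidates + 1)]
        left = [yi for xi, yi in zip(X, y) if xi[feature] < t]
        right = [yi for xi, yi in zip(X, y) if xi[feature] >= t]
        score = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

def auc(scores, labels):
    # Probability that a random "bad" is scored above a random "good".
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def make_data(n, rng):
    # Synthetic data (an assumption): the effect of x1 flips sign with x0.
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        logit = (-2.0 + 1.5 * x1) if x0 < 0 else (0.0 - 1.5 * x1)
        X.append([x0, x1])
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(800, rng)
X_test, y_test = make_data(400, rng)

# Reference model: one scorecard for the whole population.
single = fit_logistic(X_train, y_train)
auc_single = auc([predict(single, x) for x in X_test], y_test)

# Two-step model: split on feature 0, then one scorecard per segment.
threshold = best_split(X_train, y_train, feature=0)
segments = {}
for side in (False, True):
    rows = [(x, t) for x, t in zip(X_train, y_train) if (x[0] >= threshold) == side]
    segments[side] = fit_logistic([r[0] for r in rows], [r[1] for r in rows])
auc_segmented = auc([predict(segments[x[0] >= threshold], x) for x in X_test], y_test)

print(f"single scorecard:  AUC={auc_single:.3f}  Gini={2 * auc_single - 1:.3f}")
print(f"two-step model:    AUC={auc_segmented:.3f}  Gini={2 * auc_segmented - 1:.3f}")
```

The synthetic data deliberately contain an interaction between the two features, so any gain seen here reflects how the data were generated and says nothing about real portfolios. On real data the comparison should be made on a holdout sample, as above, and the two models may perform equally well.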

Available Versions

Crossref

info:doi/10.1016%2Fj.eswa.2011...

Last time updated on 01/04/2019

Southampton (e-Prints Soton)

oai:eprints.soton.ac.uk:208555

Last time updated on 05/04/2012