
The non-ignorable missing-data problem in consumer banking

Abstract

This thesis addresses a specific missing-data problem in consumer banking. Application scoring and behaviour scoring are two of the main applications of statistics and probability modelling in consumer banking. In application scoring, a missing-data problem arises because the bank selects which applicants to accept. This problem has attracted much interest, and the relevant discussion can be found under the topic of 'reject inference'. By contrast, a similar problem in behaviour scoring has not been widely explored. The problem we address in this thesis is a missing-data problem that results from selection in behaviour scoring. We review the nature of the missing-data problem and the existing solutions. Missing-data problems can be categorised as MCAR (missing completely at random), MAR (missing at random), or MNAR (missing not at random) problems. MCAR and MAR problems have attracted much attention; less discussion can be found on MNAR problems. The problem we solve in this thesis is an MNAR problem. Two of the best-known solutions to MNAR problems are the two-step method proposed by Heckman and the EM algorithm proposed by Little and Rubin. We illustrate how these existing methods can be extended to solve our problem. The extensions are constrained by an inflexible assumption, i.e. each method assumes that an unrecorded variable has a specific distribution. We introduce solutions that remove this constraint so that the empirical distribution can be used. The thesis also presents solutions that make use of updated MAR data, which are available in the case of behaviour scoring.
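
The distinction between MCAR, MAR, and MNAR can be illustrated with a small simulation. The sketch below is not taken from the thesis: the variable names, sample size, retention probabilities, and the simple two-stratum adjustment are all illustrative assumptions. It generates an always-observed covariate `x` and an outcome `y`, hides `y` under each of the three mechanisms, and compares a naive mean of the observed `y` with a mean adjusted by stratifying on `x`.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Hypothetical population: x is always observed, y may go missing."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # e.g. a score known for every account
        y = x + rng.gauss(0.0, 1.0)  # outcome of interest, correlated with x
        rows.append((x, y))
    return rows


def apply_missingness(rows, keep_prob, seed=1):
    """Replace y with None unless a random draw falls below keep_prob(x, y)."""
    rng = random.Random(seed)
    return [(x, y if rng.random() < keep_prob(x, y) else None) for x, y in rows]


def naive_mean(data):
    """Mean of y over the records where y was observed."""
    return statistics.fmean(y for _, y in data if y is not None)


def stratified_mean(data):
    """Mean of y adjusted for x: average the observed y within each stratum
    of x, then weight each stratum by its share of ALL records."""
    total = len(data)
    estimate = 0.0
    for in_stratum in (lambda x: x > 0, lambda x: x <= 0):
        stratum = [(x, y) for x, y in data if in_stratum(x)]
        observed = [y for _, y in stratum if y is not None]
        estimate += len(stratum) / total * statistics.fmean(observed)
    return estimate


rows = simulate()
true_mean = statistics.fmean(y for _, y in rows)

# Assumed retention rules (illustrative only).
mechanisms = {
    "MCAR": lambda x, y: 0.5,                   # ignores x and y
    "MAR": lambda x, y: 0.8 if x > 0 else 0.2,  # depends only on observed x
    "MNAR": lambda x, y: 0.8 if y > 0 else 0.2, # depends on y itself
}

results = {}
for name, keep_prob in mechanisms.items():
    data = apply_missingness(rows, keep_prob)
    results[name] = (naive_mean(data), stratified_mean(data))
    print(f"{name}: naive={results[name][0]:+.3f} "
          f"adjusted={results[name][1]:+.3f} true={true_mean:+.3f}")
```

In this toy setting the naive mean is close to the truth only under MCAR. Adjusting for `x` removes the bias under MAR, but a clear bias remains under MNAR because the missingness depends on the unobserved value itself. This is the situation for which model-based approaches such as Heckman's two-step method and the EM algorithm are considered.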
