500 research outputs found

    Advanced survival modelling for consumer credit risk assessment: addressing recurrent events, multiple outcomes and frailty

    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Statistics and Econometrics. This thesis worked on the application of advanced survival models in consumer credit risk assessment, particularly to address issues of recurrent delinquency (or default) and recovery (cure) events as well as multiple risk events and frailty. Each chapter (2 to 5) addressed a separate problem and several key conclusions were reached. Chapter 2 addressed the neglected area of modelling recovery from delinquency to normal performance on retail consumer loans, taking into account the recurrent nature of delinquency and also including time-dependent macroeconomic variables. Using data from a lending company in Zimbabwe, we provided a comprehensive analysis of the recovery patterns using the extended Cox model. The findings showed that behavioural variables were the most important in understanding recovery patterns of obligors. This confirms and underscores the importance of using behavioural models to understand the recovery patterns of obligors in order to prevent credit loss. The findings also revealed that falling real gross domestic product, representing a deteriorating economic situation, significantly explained the diminishing rate of recovery from delinquency to normal performance among consumers. The study pointed to the urgent need for policy measures aimed at promoting economic growth for the stabilisation of consumer welfare and the financial system at large. Chapter 3 extends the work in chapter 2 and notes that, even though multiple failure-time data are ubiquitous in finance and economics, especially in the credit risk domain, naive statistical techniques which ignore the subsequent events are commonly used to analyse such data. 
Applying standard statistical methods without addressing the recurrence of the events produces biased and inefficient estimates, thus offering erroneous predictions. We explore various ways of modelling and forecasting recurrent delinquency and recovery events on consumer loans. Using consumer loans data from a severely distressed economic environment, we illustrate and empirically compare extended Cox models for ordered recurrent recovery events. We highlight that accounting for multiple events proffers detailed information, thus providing a nuanced understanding of the recovery prognosis of delinquents. For ordered indistinguishable recurrent recovery events, we recommend using the Andersen and Gill (1982) model since it fits these assumptions and performs well on predicting recovery. Chapter 4 extends chapters 2 and 3 and highlights that rigorous credit risk analysis is not only of significance to lenders and banks but is also of paramount importance for sound regulatory and economic policy making. Increasing loan impairment or delinquency, defaults and mortgage foreclosures signal a sick economy and generate considerable financial stability concerns. For lenders and banks, the accurate estimation of credit risk parameters remains essential for pricing, profit testing, capital provisioning as well as for managing delinquents. Traditional credit scoring models such as the logit regression only provide estimates of the lifetime probability of default for a loan but cannot identify the existence of cures and/or other movements. These methods lack the ability to characterise the progression of borrowers over time and cannot utilise all the available data to understand the recurrence of risk events and possible occurrence of multiple loan outcomes. 
In this paper, we propose a system-wide multi-state framework to jointly model state occupations and the transitions between normal performance (current), delinquency, prepayment, repurchase, short sale and foreclosure on mortgage loans. The probability of loans transitioning to and from the various states is estimated in a discrete-time multi-state Markov model with seven allowable states and sixteen possible transitions. Additionally, we investigate the relationship between the probability of loans transitioning to and from various loan outcomes and loan-level covariates. We empirically test the performance of the model using US single-family mortgage loans that were originated during the first quarter of 2009 and followed on their monthly repayment performance until the third quarter of 2016. Our results show that the main factors affecting the transition into various loan outcomes are affordability as measured by debt-to-income ratio, equity as marked by loan-to-value ratio, interest rates and the property type. In chapter 5, we note that there has been increasing availability of consumer credit in Zimbabwe, yet the credit information sharing systems are not as advanced. Using frailty survival models on credit bureau data from Zimbabwe, the study investigates the possible underestimation of credit losses under the assumption of independence of default event times. The study found that adding a frailty term significantly improved the models, thus indicating the presence of unobserved heterogeneity. The major policy recommendation is for the regulator to institute appropriate policy frameworks to allow robust and complete credit information sharing and reporting, as doing so will significantly improve the functioning of the credit market.
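
The discrete-time multi-state Markov idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis's estimator: it ignores loan-level covariates and simply counts observed month-to-month moves to obtain maximum-likelihood transition probabilities. The function name, state labels and loan histories are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(histories):
    """Estimate transition probabilities by counting consecutive state pairs."""
    counts = defaultdict(Counter)
    for states in histories:
        # Pair each monthly state with the state observed in the next month.
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    # Normalise each origin row so its probabilities sum to one.
    return {
        origin: {dest: n / sum(dests.values()) for dest, n in dests.items()}
        for origin, dests in counts.items()
    }


# Hypothetical monthly histories for three loans (labels are illustrative only).
histories = [
    ["current", "current", "delinquent", "current", "prepaid"],
    ["current", "delinquent", "delinquent", "foreclosure"],
    ["current", "current", "current", "current"],
]
matrix = estimate_transition_matrix(histories)
```

Absorbing outcomes such as prepayment or foreclosure never appear as an origin row here because no loan leaves them. A covariate-dependent version would replace the row-wise counting with a regression per origin state.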

    Forecasting Loan Default in Europe with Machine Learning

    We use a dataset of 12 million residential mortgages to investigate the loan default behavior in several European countries. We model the default occurrence as a function of borrower characteristics, loan-specific variables, and local economic conditions. We compare the performance of a set of machine learning algorithms relative to the logistic regression, finding that they perform significantly better in providing predictions. The most important variables in explaining loan default are the interest rate and the local economic characteristics. The existence of relevant geographical heterogeneity in the variable importance points to the need for regionally tailored risk-assessment policies in Europe.

    A dynamic credit scoring model based on survival gradient boosting decision tree approach

    Credit scoring, which is typically transformed into a classification problem, is a powerful tool to manage credit risk since it forecasts the probability of default (PD) of a loan application. However, there is a growing trend of integrating survival analysis into credit scoring to provide a dynamic prediction of PD over time and a clear treatment of censoring. A novel dynamic credit scoring model (i.e., SurvXGBoost) is proposed based on the survival gradient boosting decision tree (GBDT) approach. Our proposal, which combines survival analysis and the GBDT approach, is expected to enhance predictability relative to statistical survival models. The proposed method is compared with several common benchmark models on a real-world consumer loan dataset. The results of out-of-sample and out-of-time validation indicate that SurvXGBoost outperforms the benchmarks in terms of predictability and misclassification cost. The incorporation of macroeconomic variables can further enhance the performance of survival models. The proposed SurvXGBoost meanwhile maintains some interpretability since it provides information on feature importance. First published online 14 December 202
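
To make the "PD over time" and censoring ideas in this abstract concrete, here is a minimal sketch. It is not SurvXGBoost. It uses a plain life-table estimate of per-period hazards and then converts them into a cumulative default curve. The function names and the toy data are assumptions for illustration.

```python
def discrete_hazards(durations, defaulted, horizon):
    """Per-period default hazard: defaults in period t divided by loans still at risk in t.

    durations[i] is the period in which loan i defaulted or was last observed;
    defaulted[i] is False for censored loans (no default seen by that period).
    """
    hazards = []
    for t in range(1, horizon + 1):
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, defaulted) if d == t and e)
        hazards.append(events / at_risk if at_risk else 0.0)
    return hazards


def cumulative_pd(hazards):
    """Cumulative PD by each period: one minus the product of (1 - hazard)."""
    survival = 1.0
    curve = []
    for h in hazards:
        survival *= 1.0 - h
        curve.append(1.0 - survival)
    return curve


# Hypothetical portfolio of four loans; the second and fourth are censored.
hazards = discrete_hazards([1, 2, 2, 3], [True, False, True, False], horizon=3)
pd_curve = cumulative_pd(hazards)
```

Censored loans stay in the at-risk count up to the period they were last observed, which is how survival methods use partial histories that a plain default/non-default classifier would have to discard or mislabel.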

    Quantitative Models for Prudential Credit Risk Management

    The thesis investigates the exogenous maturity vintage model (EMV) as a framework for achieving unification in consumer credit risk analysis. We explore how the EMV model can be used in origination modelling, impairment analysis, capital analysis, stress-testing and in the assessment of economic value. The thesis is segmented into five themes. The first theme addresses some of the theoretical challenges of the standard EMV model – namely, the identifiability problem and the forecasting of the components of the model in predictive applications. We extend the model beyond the three time dimensions by introducing a behavioural dimension. This allows the model to produce loan-specific estimates of default risk. By replacing the vintage component with either an application risk or a behavioural risk dimension, the model resolves the identifiability problem inherent in the standard model. We show that the same model can be used interchangeably to produce a point-in-time probability forecast, by fitting a time series regression for the exogenous component, and a through-the-cycle probability forecast, by omitting the exogenous component. We investigate the use of the model for regulatory capital and stress-testing under Basel III, as well as impairment provisioning under IFRS 9. We show that when a Gaussian link function is used the portfolio loss follows a Vašíček distribution. Furthermore, the asset correlation coefficient (as defined under Basel III) is shown to be a function of the level of systemic risk (which is measured by the variance of the exogenous component) and the extent to which the systemic risk can be modelled (which is measured by the coefficient of determination of the regression model for the exogenous component). The second theme addresses the problem of deriving a portfolio loss distribution from a loan-level model for loss. 
In most models (including the Basel-Vašíček regimes), this is done by assuming that the portfolio is infinitely large – resulting in a loss distribution that ignores diversifiable risk. We thus show that, holding all risk parameters constant, this assumption leads to an understatement of the level of risk within a portfolio – particularly for small portfolios. To overcome this weakness, we derive formulae that can be used to partition the portfolio risk into risk that is diversifiable and risk that is systemic. Using these formulae, we derive a loss distribution that better-represents losses under portfolios of all sizes. The third theme is concerned with two separate issues: (a) the problem of model selection in credit risk and (b) the problem of how to accurately measure probability of insolvency in a credit portfolio. To address the first problem, we use the EMV model to study the theoretical properties of the Gini statistic for default risk in a portfolio of loans and derive a formula that estimates the Gini statistic directly from the model parameters. We then show that the formulae derived to estimate the Gini statistic can be used to study the probability of insolvency. To do this, we first show that when capital requirements are determined to target a specific probability of solvency on a through-the-cycle basis, the point-in-time probability of insolvency can be considerably different from the through-the-cycle probability of insolvency – thus posing a challenge from a risk management perspective. We show that the extent of this challenge will be greater for more cyclical loan portfolios. We then show that the formula derived for the Gini statistic can be used to measure the extent of the point-in-time insolvency risk posed by using a through-the-cycle capital regime. The fourth theme considers the problem of survival modelling with time varying covariates. 
We propose an extension to the Cox regression model, allowing the inclusion of time-varying macroeconomic variables as covariates. The model is specifically applied to estimate the probability of default in a loan portfolio, where the experience is decomposed into three dimensions: (a) a survival time dimension; (b) a behavioural risk dimension; and (c) a calendar time dimension. In this regard, the model can also be viewed as an extension of the EMV model – adding a survival time dimension. A model is built for each dimension: (a) the survival time dimension is modelled by a baseline hazard curve; (b) the behavioural risk dimension is modelled by a behavioural risk index; and (c) the calendar time dimension is modelled by a macroeconomic risk index. The model lends itself to application in modelling probability of default under the IFRS 9 regime, where it can produce estimates of probability of default over variable time horizons, while accounting for time-varying macroeconomic variables. However, the model also has a broader scope of application beyond the domains of credit risk and banking. In the fifth and final theme, we introduce the concept of embedded value to a banking context. In long-term insurance, embedded value relates to the expected economic value (to shareholders) of a book of insurance contracts and is used for appraising insurance companies and measuring management's performance. We derive formulae for estimating the embedded value of a portfolio of loans, which we show to be a function of: (a) the spread between the rate charged to the borrower and the cost of funding; (b) the tenure of the loan; and (c) the level of credit risk inherent in the loan. We also show how economic value can be attributed between profits from maturity transformation and profits from credit and liquidity margin. We derive formulae that can be used to analyse the change in embedded value throughout the life of a loan. 
By modelling the credit loss component of embedded value, we derive a distribution for the economic value of a book of business. The contributions the thesis makes to the literature are of practical significance. The thesis offers a way for banks and regulators to accurately estimate the value of the asset correlation coefficient in a manner that controls for portfolio size and intertemporal heterogeneity. This will lead to improved precision in determining capital adequacy – particularly for institutions operating in uncertain environments and those operating small credit portfolios – ultimately enhancing the integrity of the financial system. The thesis also offers tools to help bank management appraise the financial performance of their businesses and measure the value created for shareholders.
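
The second theme's point about infinitely large portfolios can be illustrated with a minimal sketch under the textbook one-factor Gaussian model. These are not the thesis's own formulae: the sketch uses the standard large-portfolio quantile and the law of total variance to split default-rate variance into a systemic part and a diversifiable part that shrinks with the number of loans. The function names and parameter values are assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_pd(pd, rho, z):
    """Default probability given systematic factor z (high z means a good economy)."""
    return STD_NORMAL.cdf((STD_NORMAL.inv_cdf(pd) - rho ** 0.5 * z) / (1 - rho) ** 0.5)


def vasicek_quantile(pd, rho, alpha):
    """alpha-quantile of the default rate for an infinitely large homogeneous portfolio."""
    numerator = STD_NORMAL.inv_cdf(pd) + rho ** 0.5 * STD_NORMAL.inv_cdf(alpha)
    return STD_NORMAL.cdf(numerator / (1 - rho) ** 0.5)


def variance_split(pd, rho, n_loans, grid=2001, z_max=8.0):
    """Return (systemic, diversifiable) variance of the default rate for n_loans loans."""
    step = 2 * z_max / (grid - 1)
    mean = second = binomial = 0.0
    # Integrate numerically over the systematic factor on an even grid.
    for i in range(grid):
        z = -z_max + i * step
        weight = STD_NORMAL.pdf(z) * step
        p = conditional_pd(pd, rho, z)
        mean += weight * p
        second += weight * p * p
        binomial += weight * p * (1 - p)
    systemic = second - mean ** 2      # variance of the conditional default rate
    diversifiable = binomial / n_loans  # expected binomial noise, vanishes as n grows
    return systemic, diversifiable
```

Calling `variance_split(0.02, 0.15, 100)` and `variance_split(0.02, 0.15, 10000)` returns the same systemic term but a diversifiable term one hundred times larger for the smaller portfolio, which is the part an infinite-portfolio assumption drops.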