205 research outputs found

    Bayesian forecasting of Prepayment Rates for Individual Pools of Mortgages

    This paper proposes a novel approach for modeling prepayment rates of individual pools of mortgages. The model uses Bayesian methodology to incorporate the empirical evidence that prepayment is past-dependent. Many factors influence prepayment behavior, and for many of them information is unavailable or impossible to gather. We address this issue by creating a Bayesian mixture model and constructing a Markov chain Monte Carlo algorithm to estimate its parameters. We assess the model on a data set from the Bloomberg Database. Our results show that the burnout effect is a significant variable for explaining normal prepayment activity. This result does not hold when prepayment is triggered by non-pool-dependent events. We show how to use the new model to compute prices for Mortgage Backed Securities. Monte Carlo simulation is the traditional method for obtaining such prices, and the proposed model can easily be incorporated within a simulation pricing framework. Prices for standard Pass-Throughs are obtained using simulation. Funding: State of Texas Advanced Research Program 003658-0763; National Science Foundation CMMI-0457558, DMS-0605102. Civil, Architectural, and Environmental Engineerin
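The abstract above notes that pass-through prices are traditionally obtained by Monte Carlo simulation. The following is a minimal sketch of that pricing step only, not the paper's Bayesian mixture model. It assumes a level-pay fixed-rate pool, a flat discount yield, and prepayment speeds drawn from a placeholder uniform distribution. The function names (`smm_from_cpr`, `passthrough_price`, `mc_price`) and all parameter values are hypothetical.

```python
import random


def smm_from_cpr(cpr):
    """Convert an annual prepayment rate (CPR) to a monthly rate (SMM)."""
    return 1.0 - (1.0 - cpr) ** (1.0 / 12.0)


def passthrough_price(balance, wac, pass_rate, n_months, cpr_path, discount_rate):
    """Present value of investor cash flows along one prepayment path.

    wac is the annual mortgage coupon (must be > 0), pass_rate the annual
    coupon passed to investors, cpr_path one annual CPR per month, and
    discount_rate a flat annual yield (an assumption of this sketch).
    """
    r = wac / 12.0
    pv = 0.0
    for m in range(1, n_months + 1):
        if balance <= 1e-9:
            break
        remaining = n_months - m + 1
        # Level payment that amortizes the current balance over the remaining term.
        payment = balance * r / (1.0 - (1.0 + r) ** -remaining)
        scheduled = payment - balance * r
        prepaid = (balance - scheduled) * smm_from_cpr(cpr_path[m - 1])
        cash = scheduled + prepaid + balance * pass_rate / 12.0
        pv += cash / (1.0 + discount_rate / 12.0) ** m
        balance -= scheduled + prepaid
    return pv


def mc_price(n_paths, seed, balance, wac, pass_rate, n_months, discount_rate):
    """Average the path-wise present value over simulated prepayment speeds.

    The uniform draw is a placeholder for draws from a fitted prepayment model.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        speed = rng.uniform(0.02, 0.12)  # hypothetical constant CPR for this path
        total += passthrough_price(balance, wac, pass_rate, n_months,
                                   [speed] * n_months, discount_rate)
    return total / n_paths
```

In this sketch a security whose pass-through rate equals the discount yield prices at par whatever the prepayment speed, while a premium security (pass-through rate above the yield) loses value as prepayments accelerate. That sensitivity is why the prepayment model matters for pricing.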

    Predicting prepayment in home loans: Modelling full and partial prepayment in the Portuguese banking sector using machine learning methods

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. A loan prepayment occurs when the borrower repays a loan early, i.e. the borrower pays more than the contractual amount due. The repayment may cover part of the outstanding principal (partial repayment) or the total principal outstanding (full repayment). From a bank's perspective, the study of early repayment, whether full or partial, is relevant because it changes the scheduled cash flows. In particular, future cash flows decrease as the result of an unknown future event. Hence, the primary purpose of this study is to model prepayment events in the mortgage loans of a large Portuguese bank through a machine learning approach, assessing performance with the Area Under the Receiver Operating Characteristic Curve (ROC), the Gain or Lift, and the Kolmogorov-Smirnov statistic. This allows the prepayment phenomenon to be studied in the Portuguese market using real bank data and machine learning models. Because real-life data were used, the first part of the study consisted of pre-processing the data to ensure that noise and data quality problems did not enter the models. The second part consisted of computing the machine learning models, testing Artificial Neural Network and Random Forest models and comparing their performance using the ROC, Gain or Lift and Kolmogorov-Smirnov statistic. The results show that both the full and the partial prepayment models perform well on all three performance metrics, with good predictive results and discriminatory capacity. The partial repayment model is superior to the full repayment model, with a difference that is worth mentioning although not very large. This study is particularly relevant given its analysis of a Portuguese bank and its application of machine learning models to prepayment modelling, for which studies are scarce. Furthermore, the bank in which this study is framed has recently been working to update the traditional models currently in force
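The three evaluation metrics named in this abstract can be computed directly from binary labels and model scores. The sketch below is illustrative only and is not taken from the dissertation. The function names and the toy labels and scores in the usage note are hypothetical, and labels are assumed to be coded 1 for a prepayment event and 0 otherwise.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking; ties get half credit."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(labels, scores):
    """Largest gap between the score distributions of events and non-events."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def lift_at(labels, scores, fraction):
    """Event rate among the top-scored fraction divided by the overall event rate."""
    ranked = sorted(zip(scores, labels), reverse=True)
    k = max(1, int(fraction * len(ranked)))
    top_rate = sum(y for _, y in ranked[:k]) / k
    return top_rate / (sum(labels) / len(labels))
```

For the toy input `labels = [0, 0, 1, 1]` and `scores = [0.1, 0.4, 0.35, 0.8]`, `roc_auc` returns 0.75 and `ks_statistic` returns 0.5. A perfectly separating score gives 1.0 for both.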

    Multiple Event Incidence and Duration Analysis for Credit Data Incorporating Non-Stochastic Loan Maturity

    Applications of duration analysis in Economics and Finance exclusively employ methods for events of stochastic duration. In application to credit data, previous research incorrectly treats the time to pre-determined maturity events as censored stochastic event times. The medical literature has binary parametric ‘cure rate’ models that deal with populations that never experienced the modelled event. We propose and develop a Multinomial parametric incidence and duration model, incorporating such populations. In the class of cure rate models, this is the first fully parametric multinomial model and is the first framework to accommodate an event with pre-determined duration. The methodology is applied to unsecured personal loan credit data provided by one of Australia’s largest financial services organizations. This framework is shown to be more flexible and predictive through a simulation and empirical study that reveals: simulation results of estimated parameters with a large reduction in bias; superior forecasting of duration; explanatory variables can act in different directions upon incidence and duration; and, variables exist that are statistically significant in explaining only incidence or duration

    Impacts of extreme weather events on mortgage risks and their evolution under climate change: A case study on Florida

    We develop an additive Cox proportional hazard model with time-varying covariates, including spatio-temporal characteristics of weather events, to study the impact of weather extremes (heavy rains and tropical cyclones) on the probability of mortgage default and prepayment. We compare the survival model with a flexible logistic model and an extreme gradient boosting algorithm. We estimate the models on a portfolio of mortgages in Florida, consisting of 69,046 loans and 3,707,831 loan-month observations with localization data at the five-digit ZIP code level. We find a statistically significant and non-linear impact of tropical cyclone intensity on default as well as a significant impact of heavy rains in areas with large exposure to flood risks. These findings confirm existing results in the literature and also provide estimates of the impact of the extreme event characteristics on mortgage risk, e.g. the impact of tropical cyclones on default more than doubles in magnitude when moving from a hurricane of category two to a hurricane of category three or more. We build on the identified effect of exposure to flood risk (in interaction with heavy rainfall) on mortgage default to perform a scenario analysis of the future impacts of climate change using the First Street flood model, which provides projections of exposure to floods in 2050 under RCP 4.5. We find a systematic increase in risk under climate change that can vary based on the scenario of extreme events considered. Climate-adjusted credit risk allows risk managers to better evaluate the impact of climate-related risks on mortgage portfolios

    Survival Analysis for Credit Scoring: Incidence and Latency

    Duration analysis is an analytical tool for time-to-event data that econometricians have borrowed from medicine and engineering to investigate typical economic and finance problems. In applications to credit data, the time to a pre-determined maturity event has been treated as a censored observation for the events with stochastic latency. This paper develops a methodology, motivated by the cure rate model framework, to appropriately analyse a set of mutually exclusive terminal events where at least one event may have a pre-determined latency. The methodology is applied to a set of personal loan data provided by one of Australia's largest financial services institutions. This is the first framework to simultaneously model prepayment, write-off and maturity events for loans. Furthermore, in the class of cure rate models it is the first fully parametric multinomial model and the first to accommodate an event with pre-determined latency. The simulation study found that this model performed better than the two most common applications of survival analysis to credit data. In addition, the application to personal loan data reveals that particular explanatory variables can act in different directions upon the incidence and latency of an event, and that some variables may be statistically significant in explaining only incidence or only latency
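The split between incidence (whether an event ever happens) and latency (when it happens, given that it does) can be illustrated with the standard binary mixture cure model. This is a simplified sketch of the general cure rate idea, not the multinomial model proposed in the paper. The Weibull latency distribution and the parameter values in the usage note are assumptions chosen for illustration.

```python
import math


def weibull_survival(t, shape, scale):
    """Latency part: survival function of the event time among susceptible loans."""
    return math.exp(-((t / scale) ** shape))


def population_survival(t, incidence, shape, scale):
    """Mixture cure model: a share (1 - incidence) of loans never has the event.

    S_pop(t) = (1 - incidence) + incidence * S_latency(t)
    """
    return (1.0 - incidence) + incidence * weibull_survival(t, shape, scale)
```

With `incidence = 0.3`, the population survival starts at 1.0 and levels off at 0.7 rather than falling to zero. A covariate can therefore raise incidence while lengthening latency, or the reverse, which is the kind of opposing effect the abstract reports.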

    Joint modelling of longitudinal and survival data for dynamic prediction in credit-related applications

    Lenders monitor their borrowers over time, allowing them to dynamically predict the probability of an event of interest, such as default. The widely used survival models focus on when the event happens and can handle time-varying covariates (TVCs) and censored observations. However, an issue little addressed in the literature is that the model specification and the predictive framework depend on the type of TVC included. TVCs can be either exogenous or endogenous to the survival time. Exogenous TVCs are those whose future paths are not affected by the event's occurrence, such as macroeconomic variables. Endogenous TVCs, on the contrary, are those whose paths are influenced by the survival status. An example of the latter is the unpaid principal balance when the event is default. This thesis explores new mathematical models in credit-related applications, known as joint models of longitudinal and survival data. Initially developed in medical research, these models, in their standard version, are formed by two sub-models, one for the survival process and the other for the endogenous TVC (also named the longitudinal outcome in this context). A latent structure links the sub-models, commonly in the form of random effects. Joint models have two advantages compared to survival models. First, they allow us to handle possible endogeneities in the TVCs. Second, by jointly modelling both processes, they offer a dynamic prediction framework that incorporates their mutual evolution. We propose a series of innovations to make the approach appropriate to credit-related applications. These innovations relate to the nature of survival time, the specific evolution of the TVCs, ways to scale the technique to large datasets and how to leverage the available data in the modelling framework. Concretely, we adapt the formulation of the joint models and their performance metrics to the discrete nature of the loan data. In addition, we include autoregressive terms in the TVC specification to address observed serial correlation and enhance predictive capability. Moreover, we can study more complex specifications with larger datasets by reformulating the approach within the INLA framework, a fast and accurate algorithm for Bayesian inference. Among these specifications are joint models with more than one TVC and a joint model that leverages geographical information to include spatial and spatio-temporal effects in the hazard function. We also introduce a more accurate way to estimate individual survival predictions using the Laplace method. Finally, to compare different models, we propose a computationally efficient implementation of the cross-entropy estimate of the posterior predictive conditional density that uses the estimates obtained in the inference step. We apply joint models to predict the time to credit events in the following three settings: default in US mortgages, full prepayment in a German consumer loan portfolio, and full prepayment in US mortgages. The main empirical results show that the autoregressive terms in the joint model yield better discrimination performance, that predictive ability is significantly enhanced compared to survival models when more TVCs are considered, and that the inclusion of spatial effects consistently leads to better data representation

    ESSAYS ON INFORMATION ASYMMETRY IN THE U.S. RESIDENTIAL MORTGAGE MARKET: INCENTIVES AND ESTIMATIONS

    This dissertation focuses on appraisal bias, a phenomenon in the residential mortgage market that stems from information asymmetry. It is composed of two essays, one theoretical and one empirical. The theoretical essay analyzes the existence of appraisal bias in a dynamic game of incomplete information and solves for the perfect Bayesian equilibria. It establishes how adding a semi-verifiability condition to a cheap-talk game helps construct non-babbling equilibria in an asymmetric information environment. The empirical essay quantifies appraisal bias at the individual loan level and measures its effect on mortgage terminations. It tests the extent to which option theory explains default and prepayment behavior in the residential mortgage market. It treats default and prepayment hazards as dependent competing risks, jointly estimates mortgage terminations in a competing risk proportional hazard framework, and controls for unobserved heterogeneity using the Heckman-Singer nonparametric method. It replaces the inaccurate approximated likelihood function that has been applied so far in competing risks analysis with an exact likelihood function. Armed with repeat sale transaction data, this paper is the first to analyze the effect of appraisal bias on mortgage terminations. It concludes that appraisal bias is important in determining mortgage terminations and needs to be controlled for to correctly estimate termination hazards
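To make the competing risks idea concrete, the sketch below computes cumulative incidence for prepayment and default under the simplest possible setup: constant cause-specific hazards. This is a deliberate simplification for illustration. It does not include covariates, unobserved heterogeneity, or the exact likelihood described in the dissertation, and the hazard values in the usage note are hypothetical.

```python
import math


def cumulative_incidence(t, hazards):
    """Cumulative incidence by cause under constant cause-specific hazards.

    hazards maps each cause name to its hazard rate per unit time.
    Returns (incidence by cause, probability that the loan is still active at t).
    """
    total = sum(hazards.values())
    still_active = math.exp(-total * t)
    incidence = {cause: h / total * (1.0 - still_active)
                 for cause, h in hazards.items()}
    return incidence, still_active
```

Calling `cumulative_incidence(5.0, {"prepayment": 0.10, "default": 0.02})` splits the terminated share between the two causes in proportion to their hazards. The incidences and the still-active probability sum to one, so a higher prepayment hazard lowers observed default incidence even when the default hazard is unchanged.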