98,794 research outputs found

    A comparison study of credit scoring models

    In this paper we consider a credit scoring problem. We compare three powerful credit scoring models: genetic programming (GP), backpropagation neural networks (BP) and support vector machines (SVM) when applied to this problem, then we give a combined model. The results show that the combined model produces good classification results

    A Comparative Study of Data Mining Techniques for Credit Scoring in Banking

    Shih-Chen Huang and Min-Yuh Day (2013), "A Comparative Study of Data Mining Techniques for Credit Scoring in Banking", in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI 2013), San Francisco, California, USA, August 14-16, 2013, pp. 684-691. (Sponsor: IEEE; indexed in EI.) Credit is becoming one of the most important sources of income in banking. Past studies indicate that, among credit risk scoring models, logistic regression and neural networks have performed best. The purpose of this paper is to conduct a comparative study of the accuracy of classification models in order to reduce credit risk. We use enterprise data mining software to construct four kinds of classification model, namely decision tree, logistic regression, neural network and support vector machine, for credit scoring in banking. We conduct a systematic comparison and analysis of the accuracy of 17 classification models. The contribution of this paper is that we use different classification methods to construct classification models and compare their accuracy, and the evidence demonstrates that the support vector machine models have higher accuracy rates and therefore outperform past classification methods in the context of credit scoring in banking

    Credit scoring: comparison of non‐parametric techniques against logistic regression

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. Over the past decades, financial institutions have been giving increased importance to credit risk management as a critical tool to control their profitability. More than ever, it has become crucial for these institutions to discriminate well between good and bad clients, so that they accept only credit applications that are unlikely to default. To calculate the probability of default of a particular client, most financial institutions have credit scoring models based on parametric techniques. Logistic regression is the current industry standard technique in credit scoring models, and it is one of the techniques under study in this dissertation. Although it is regarded as a robust and intuitive technique, it is not free from criticism of its model assumptions, which can compromise its predictions. This dissertation intends to evaluate the gains in performance resulting from using more modern non-parametric techniques instead of logistic regression, performing a model comparison over four different real-life credit datasets. Specifically, the techniques compared against logistic regression in this study consist of two single classifiers (decision tree and SVM with RBF kernel) and two ensemble methods (random forest and stacking with cross-validation). The literature review shows that heterogeneous ensemble approaches have a weaker presence in credit scoring studies and, for that reason, stacking with cross-validation was included in this study. The results demonstrate that logistic regression outperforms the decision tree classifier, performs similarly to SVM and slightly underperforms both ensemble approaches to a similar extent

    A credit scoring system for loan borrowers based on data mining (original title: Система кредитного скорингу позичальників кредитів на основі інтелектуального аналізу даних)

    Master's thesis: 102 pp., 25 figures, 25 tables, 1 appendix, 13 sources. Object of study: loan borrowers, represented by statistical data with selected characteristics. Subject of study: mathematical models, data mining methods, criteria for assessing the adequacy of scoring models, and methods for constructing a scorecard. Research methods: data mining methods, neural networks, regression analysis, statistical methods of data analysis, classification methods, and methods for constructing scoring models. The aim of the work is to analyse a credit scoring system based on data mining methods and models, and to compare these with existing credit scoring methods. The work reviews the main approaches to constructing scoring models and examines and analyses neural network and Bayesian network methods. The modelling and evaluation results were analysed in order to justify the choice of the best model for assessing clients' creditworthiness. The outcome of the work is the identification of credit scoring methods and the design of a system architecture whose development should reduce banks' credit risk and, accordingly, raise their credit rating, which in turn would have a systemic effect on the banking system of Ukraine

    SMEs Credit Risk Modelling for Internal Rating Based Approach in Banking Implementation of Basel II Requirement

    This thesis explores Internal Rating Based (IRB) modelling of credit risk for Small and Medium Enterprises (SMEs), as required for implementation of the Basel II Accord. There has been limited previous research on this important sector of the economy. There are two major approaches, Accounting Based and Merton Type, and these are compared. To make the comparison, a small sample is considered initially and simulation is used to explore the two approaches. The study indicates some limitations of the analysis for both approaches, for example the issue of collinearity for the Accounting Based approach and the lack of trading in SMEs’ equity affecting the Merton Type approach. A large sample is then investigated using standard credit scoring approaches for the Accounting Based modelling. Different definitions of default and distress are considered to overcome the problem of the low number of defaults. These approaches are found to be viable. The Merton Type model is then compared to benchmark models from the Accounting Based approach. The predictions are compared over differing time horizons. It is found that Merton Type models perform well within a limited period compared to the Accounting Based approach. Overall, credit scoring models demonstrated better performance when the sample group included a considerable number of ‘Bad’ firms or when the cutoff point was selected so that the acceptance rate was relatively low; otherwise the model’s predictive accuracy would decline. The Merton model presented better predictive accuracy with higher acceptance rates. Credit scoring models were able to give early signs of the default year. In addition, if a company's credit quality is going to decline or its default probability is going to rise this year, Merton Type models can be helpful in adjusting the credit rating. When considering a loan to a company, a bank wants to know the likelihood of default for the duration of the loan. In this sense, Merton models are only useful for relatively short loan terms
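
The abstract above does not give the exact Merton Type specification used in the thesis. As a rough illustration only, the textbook form of the model treats default as the firm's asset value falling below its debt at the loan horizon, so the default probability depends on an assumed asset value, asset volatility, drift and horizon. The function and parameter names below are hypothetical, and the input figures are made up for the example.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_default_probability(asset_value, debt, drift, volatility, horizon):
    """Probability that assets end below the debt level at the horizon.

    Textbook Merton form, not necessarily the thesis's specification.
    asset_value and volatility are not directly observable; they are
    usually inferred from traded equity, which many SMEs lack.
    """
    if asset_value <= 0 or debt <= 0 or volatility <= 0 or horizon <= 0:
        raise ValueError("all inputs except drift must be positive")
    distance_to_default = (
        math.log(asset_value / debt)
        + (drift - 0.5 * volatility ** 2) * horizon
    ) / (volatility * math.sqrt(horizon))
    return normal_cdf(-distance_to_default)


if __name__ == "__main__":
    # Illustrative inputs only: assets 120, debt 100, 5% drift, 25% volatility.
    for years in (1, 3, 5):
        pd = merton_default_probability(120.0, 100.0, 0.05, 0.25, years)
        print(f"horizon={years}y  PD={pd:.4f}")
```

Because the horizon enters the formula directly, the same firm receives a different default probability for each loan term, which is one way to read the abstract's remark about loan duration.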

    Variable selection methods in credit scoring models (original title: Métodos de seleção de variáveis em modelos de credit scoring)

    In recent years, demand in the Brazilian credit market has grown and credit has become more widely used. Granting credit involves risks, which can mean large monetary losses for companies. Credit models therefore emerged, which seek to identify the characteristics that distinguish good payers from bad payers. Credit Scoring models are differentiated according to the stage of the customer's credit cycle, and are usually divided into Application Scoring, Behavioral Scoring and Collection Scoring. These models are usually built on a large number of customer characteristics (variables), since they can draw on registration, credit, checking account and other information. A refined process of extracting and selecting the variables that best explain good and bad payers is therefore necessary. Several data mining techniques reduce the dimension of the original data and/or select variables for use in statistical models, including Principal Component Analysis (PCA) and the Stepwise method. Although widely used, neither of these two variable selection methods considers the practical performance measures of the models (such as the KS test result). Hence the need to examine the use of these measures as a means of selecting variables for credit models. The objective of the present study is to compare Credit Scoring models built using different variable selection methods: PCA, Stepwise and a new selection method based on the KS test result, called the Eilert method. The information used in the study comes from a real database, with two sample compositions (development and validation), totalling 240,000 customers. The selection methods were tested using 90 variables describing customer registration and behavior in the company. The final credit models were built using Logistic Regression, and the performance measures used to compare the results were the percentage of correct classifications, the KS result and the ROC curve. In the comparison of the credit models built with the variables indicated by the three selection methods, the performance of the models was similar in both the development and validation samples, with KS values around 57%, ROC around 0.85 and a percentage of correct classifications around 93%. After variable selection with the Eilert, PCA and Stepwise methods (the latter at significance levels of 0.05 and 0.01), the final models contained 22, 20, 63 and 56 variables, respectively. The three variable selection methods were thus effective in reducing the final number of variables used in the Credit Scoring models built in the study. The choice of method should therefore be based on ease of analysis, interpretation and presentation of the model results
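
The abstract does not describe how the Eilert method uses the KS result, so the following is not that method. It is only a minimal sketch of the underlying quantity: the two-sample KS statistic, the largest gap between the cumulative distributions of a variable among good and bad payers, used here to rank candidate variables one at a time. The function names, the label coding (0 = good, 1 = bad) and the toy data are assumptions made for the example.

```python
from bisect import bisect_right


def ks_statistic(values, labels):
    """Largest gap between the empirical CDFs of good (0) and bad (1) payers."""
    goods = sorted(v for v, y in zip(values, labels) if y == 0)
    bads = sorted(v for v, y in zip(values, labels) if y == 1)
    if not goods or not bads:
        raise ValueError("both classes must be present")
    largest_gap = 0.0
    for threshold in sorted(set(values)):
        # Share of each class with a value at or below the threshold.
        cdf_good = bisect_right(goods, threshold) / len(goods)
        cdf_bad = bisect_right(bads, threshold) / len(bads)
        largest_gap = max(largest_gap, abs(cdf_good - cdf_bad))
    return largest_gap


def rank_variables_by_ks(columns, labels):
    """Return (name, KS) pairs sorted from most to least discriminating."""
    scored = [(name, ks_statistic(col, labels)) for name, col in columns.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]  # hypothetical coding: 0 = good payer, 1 = bad payer
    columns = {
        "income": [1.0, 2.0, 3.0, 4.0],  # separates the classes perfectly
        "noise": [1.0, 2.0, 1.0, 2.0],   # identical distribution in both classes
    }
    for name, ks in rank_variables_by_ks(columns, labels):
        print(f"{name}: KS={ks:.2f}")
```

A univariate ranking like this ignores correlation between variables; a practical selection procedure would normally also check how the fitted model's KS changes as variables are added or dropped.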

    Forecasting creditworthiness in retail banking: a comparison of cascade correlation neural networks, CART and logistic regression scoring models

    Research on modelling credit scoring systems, including their relevance to forecasting and decision making in the financial sector, has concentrated on developed countries, whilst developing countries have been largely neglected. The focus of our investigation is the Cameroonian commercial banking sector, with implications for fellow members of the Banque des Etats de L’Afrique Centrale (BEAC) family, which apply the same system. We investigate their currently used approaches to assessing personal loans and we construct appropriate scoring models. Three statistical scoring techniques are applied, namely Logistic Regression (LR), Classification and Regression Tree (CART) and Cascade Correlation Neural Network (CCNN). To compare the scoring models’ performances we use Average Correct Classification (ACC) rates, error rates, the ROC curve and the GINI coefficient as evaluation criteria. The results demonstrate that, in terms of forecasting power, default cases can be reduced from 15.69% under the current system to 3.34% under the best scoring model, namely CART. The predictive capabilities of all three models are rated as at least very good using the GINI coefficient, and as excellent using the ROC curve for both CART and CCNN. It should be emphasised that in terms of prediction rate, CCNN is superior to the other techniques investigated in this paper. Also, a sensitivity analysis of the variables identifies borrower’s account functioning, previous occupation, guarantees, car ownership, and loan purpose as key variables in the forecasting and decision making process which are at the heart of overall credit policy
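
The ROC curve and the GINI coefficient named in the abstract above are closely related: under the usual convention, Gini = 2 × AUC − 1, where AUC is the area under the ROC curve. The sketch below is not taken from the paper; it computes AUC by comparing every defaulter/non-defaulter pair, with hypothetical function names, an assumed label coding (1 = default, 0 = no default) and made-up scores.

```python
def area_under_roc(scores, labels):
    """AUC as the share of (default, non-default) pairs ranked correctly.

    A higher score is assumed to mean a higher predicted default risk.
    Ties between a defaulter and a non-defaulter count as half a win.
    """
    defaults = [s for s, y in zip(scores, labels) if y == 1]
    non_defaults = [s for s, y in zip(scores, labels) if y == 0]
    if not defaults or not non_defaults:
        raise ValueError("both classes must be present")
    wins = 0.0
    for d in defaults:
        for n in non_defaults:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(defaults) * len(non_defaults))


def gini_coefficient(scores, labels):
    """Gini = 2 * AUC - 1: 0 for a random ranking, 1 for a perfect one."""
    return 2.0 * area_under_roc(scores, labels) - 1.0


if __name__ == "__main__":
    scores = [0.9, 0.4, 0.6, 0.1]  # illustrative predicted default risks
    labels = [1, 1, 0, 0]          # 1 = default, 0 = no default
    print("AUC:", area_under_roc(scores, labels))
    print("Gini:", gini_coefficient(scores, labels))
```

The pairwise loop is quadratic in the sample size, which is fine for an illustration; a rank-based computation would be preferable on a full loan portfolio.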

    Would credit scoring work for Islamic finance? A neural network approach

    Purpose – The main aim of this paper is to determine whether the decision making process of the Islamic financial houses in the UK can be improved through the use of credit scoring modeling techniques as opposed to the currently used judgmental approaches. Subsidiary aims are to identify how scoring models can reclassify accepted applicants who are later considered as having bad credit, and how many of the rejected applicants are later considered as having good credit; and to highlight significant variables that are crucial in accepting and rejecting applicants, which can further aid the decision making process. Design/methodology/approach – A real data-set of 487 applicants is used, consisting of 336 accepted credit applications and 151 rejected credit applications made to an Islamic finance house in the UK. In order to build the proposed scoring models, the data-set is divided into training and hold-out sub-sets. The training sub-set is used to build the scoring models and the hold-out sub-set is used to test their predictive capabilities. 70 percent of the applicants are used for the training sub-set and 30 percent for the hold-out sub-set. Three statistical modeling techniques, namely Discriminant Analysis (DA), Logistic Regression (LR) and Multi-layer Perceptron (MP) neural network, are used to build the proposed scoring models. Findings – Our findings reveal that the LR model has the highest Correct Classification (CC) rate in the training sub-set, whereas MP outperforms the other techniques and has the highest CC rate in the hold-out sub-set. MP also outperforms the other techniques in predicting the rejected credit applications and has the lowest Misclassification Cost (MC) of the techniques. In addition, results from the MP models show that monthly expenses, age and marital status are the key factors affecting the decision making process. Research limitations/implications – Although our sample is small and restricted to an Islamic finance house in the UK, the results are robust. Future research could consider enlarging the sample in the UK and also internationally, allowing cultural differences to be identified. The results indicate that the scoring models can be of great benefit to Islamic finance houses in their decisions to accept or reject new credit applications, and can thus improve their efficiency and effectiveness. Originality/value – Our contribution is the first to apply credit scoring modeling techniques in Islamic finance. Also, in building a scoring model, our application takes a different approach by using accepted and rejected credit applications instead of good and bad credit histories. This identifies the opportunity costs of misclassifying credit applications as rejected
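
The abstract above does not say how the 70/30 split was drawn or which cost figures entered the Misclassification Cost. The following is a minimal sketch under stated assumptions: a seeded random split, a label coding of 1 = accepted and 0 = rejected, and hypothetical unit costs for wrongly accepting and wrongly rejecting an application. All function names and numbers are illustrative.

```python
import random


def train_holdout_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it into training and hold-out parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def correct_classification_rate(actual, predicted):
    """Share of applications whose predicted decision matches the actual one."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def misclassification_cost(actual, predicted, cost_false_accept, cost_false_reject):
    """Average cost per application; the two unit costs are assumed inputs."""
    false_accepts = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    false_rejects = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    total = cost_false_accept * false_accepts + cost_false_reject * false_rejects
    return total / len(actual)


if __name__ == "__main__":
    applicants = list(range(487))  # stand-in identifiers for the data-set
    train, holdout = train_holdout_split(applicants)
    print(len(train), "training,", len(holdout), "hold-out")

    actual = [1, 1, 0, 0]     # 1 = accepted, 0 = rejected (assumed coding)
    predicted = [1, 0, 0, 1]
    print("CC rate:", correct_classification_rate(actual, predicted))
    print("MC:", misclassification_cost(actual, predicted, 5.0, 1.0))
```

Weighting the two error types differently is what lets a model with a lower CC rate still come out ahead on cost, which is why the abstract reports both measures.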