
    How Secure Are Good Loans: Validating Loan-Granting Decisions And Predicting Default Rates On Consumer Loans

    The failure or success of the banking industry depends largely on the industry's ability to properly evaluate credit risk. In the consumer-lending context, the bank's goal is to maximize income by issuing as many good loans to consumers as possible while avoiding losses associated with bad loans. Mistakes could severely affect profits because the losses associated with one bad loan may undermine the income earned on many good loans. Therefore, banks carefully evaluate the financial status of each customer as well as their creditworthiness and weigh them against the bank's internal loan-granting policies. Recognizing that even a small improvement in credit scoring accuracy translates into significant future savings, the banking industry and the scientific community have been employing various machine learning and traditional statistical techniques to improve credit risk prediction accuracy. This paper examines historical data from consumer loans issued by a financial institution to individuals that the financial institution deemed to be qualified customers. The data consists of the financial attributes of each customer and includes a mixture of loans that the customers paid off and loans on which they defaulted. The paper uses three different data mining techniques (decision trees, neural networks, logit regression) and an ensemble model, which combines the three techniques, to predict whether a particular customer defaulted on or paid off his/her loan. The paper then compares the effectiveness of each technique and analyzes the risk of default inherent in each loan and group of loans. The data mining classification techniques and analysis can enable banks to more precisely classify consumers into various credit risk groups. Knowing what risk group a consumer falls into would allow a bank to fine-tune its lending policies by recognizing high-risk groups of consumers to whom loans should not be issued, and identifying safer loans that should be issued, on terms commensurate with the risk of default.

    Investigation into the Predictive Capability of Macro-Economic Features in Modelling Credit Risk for Small Medium Enterprises

    This research project investigates the predictive capability of macro-economic features in modelling credit risk for small and medium enterprises (SMEs). There have been indications that there is a strong correlation between economic growth and the size of the SME sector in an economy. However, since the financial crisis and the consequent policies and regulations, SMEs have been hampered in their attempts to access credit. It has also been noted that while there is a substantial amount of credit risk literature, there is little research on how macro-economic factors affect credit risk. Being able to improve credit scoring by even a small amount can have a very positive effect on a financial institution's profits, reputation and ability to support the economy. Typically, two methods of scoring are carried out in the credit scoring process: an application scoring model and a behavioural scoring model. These models for predicting customers who are likely to default usually rely upon financial, demographic and transactional data as the predictive inputs. This research investigates the use of a much coarser source of data at a macro-economic level for low-level and high-level regions in Ireland. Features such as level of employment/unemployment, educational attainment, consumer spending trends and default levels by different banking products will be evaluated as part of the research project. In the course of this research, techniques and methods are established for evaluating the usefulness of macro-economic features. These are subsequently introduced into the predictive models to be evaluated. It was found that when coarse classification was employed and the macro-economic features with the highest information value were subsequently chosen for the predictive model, the accuracy across all performance measures improved significantly. This indicates that macro-economic features have the potential to be used in modelling credit risk for SMEs in the future.
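
The abstract above selects features by information value after coarse classification. As a minimal sketch, assuming the usual weight-of-evidence definition of information value (the abstract does not state its formula), the calculation for one binned feature could look like this. The function name and the bin counts are hypothetical and are not taken from the paper.

```python
import math

def information_value(bins):
    """Information value of one coarse-classified feature.

    bins: list of (n_good, n_bad) counts, one pair per coarse class.
    Assumes every class has at least one good and one bad account,
    so that the logarithm below is defined.
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    iv = 0.0
    for good, bad in bins:
        pct_good = good / total_good   # share of all goods in this class
        pct_bad = bad / total_bad      # share of all bads in this class
        woe = math.log(pct_good / pct_bad)  # weight of evidence
        iv += (pct_good - pct_bad) * woe
    return iv

# Hypothetical counts: a feature that separates goods from bads
# scores higher than one whose classes have identical good/bad shares.
separating = information_value([(80, 2), (20, 8)])
uninformative = information_value([(50, 5), (50, 5)])
```

A feature whose classes all have the same share of goods and bads gets an information value of zero, which is why ranking by this quantity can serve as a feature filter.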

    Support Vector Machines for Credit Scoring and discovery of significant features

    The assessment of risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit scoring for determining likelihood to default based on consumer application and credit reference agency data. We test support vector machines against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default.

    Supervised and unsupervised data mining approaches in loan default prediction

    Given the paramount importance of data mining in organizations and the possible contribution of a data-driven customer classification recommender system for loan-extending financial institutions, the study applied supervised and unsupervised data mining approaches to derive the best classifier of loan default. A total of 900 instances with determined attributes and class labels were used for the training and cross-validation processes, while prediction used 100 new instances without class labels. In the training phase, J48 with a confidence factor of 50% attained the highest classification accuracy (76.85%) among its variants, k-nearest neighbors (k-NN) 3 attained the highest (78.38%) among the IBk variants, naïve Bayes had a classification accuracy of 76.65%, and logistic had a classification accuracy of 77.31%. k-NN 3 and logistic had the highest classification accuracy, F-measures, and kappa statistics. Applying these algorithms to the test set yielded 48 non-defaulters and 52 defaulters for k-NN 3, and 44 non-defaulters and 56 defaulters for logistic. Implications were discussed in the paper.
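
The abstract above reports kappa statistics alongside accuracy. As a rough sketch, assuming this means Cohen's kappa computed from a confusion matrix, the statistic corrects observed agreement for the agreement expected by chance. The function name and the counts below are hypothetical and do not reproduce the paper's results.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix.

    matrix[i][j] is the number of instances of actual class i
    that were predicted as class j.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        (sum(matrix[i]) / total) * (sum(row[i] for row in matrix) / total)
        for i in range(size)
    )
    return (observed - expected) / (1 - expected)

# Hypothetical counts: rows are actual (non-default, default),
# columns are predicted (non-default, default).
kappa = cohen_kappa([[40, 10], [12, 38]])
```

Kappa is 1 for perfect agreement and 0 when the classifier does no better than chance, which makes it a useful companion to raw accuracy on imbalanced loan data.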

    Forecasting creditworthiness in retail banking: a comparison of cascade correlation neural networks, CART and logistic regression scoring models

    The preoccupation with modelling credit scoring systems including their relevance to forecasting and decision making in the financial sector has been with developed countries whilst developing countries have been largely neglected. The focus of our investigation is the Cameroonian commercial banking sector with implications for fellow members of the Banque des Etats de L’Afrique Centrale (BEAC) family which apply the same system. We investigate their currently used approaches to assessing personal loans and we construct appropriate scoring models. Three statistical modelling scoring techniques are applied, namely Logistic Regression (LR), Classification and Regression Tree (CART) and Cascade Correlation Neural Network (CCNN). To compare various scoring models’ performances we use Average Correct Classification (ACC) rates, error rates, ROC curve and Gini coefficient as evaluation criteria. The results demonstrate that a reduction in terms of forecasting power from 15.69% default cases under the current system, to 3.34% based on the best scoring model, namely CART can be achieved. The predictive capabilities of all three models are rated as at least very good using the Gini coefficient; and rated excellent using the ROC curve for both CART and CCNN. It should be emphasised that in terms of prediction rate, CCNN is superior to the other techniques investigated in this paper. Also, a sensitivity analysis of the variables identifies borrower’s account functioning, previous occupation, guarantees, car ownership, and loan purpose as key variables in the forecasting and decision making process which are at the heart of overall credit policy.
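
The abstract above evaluates models with the ROC curve and the Gini coefficient. In credit scoring the two are commonly linked by Gini = 2 × AUC − 1; the sketch below assumes that convention, which the abstract itself does not spell out. The function names, scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model scores, higher meaning more likely to default.
    labels: 1 for default, 0 for non-default.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Count defaulter/non-defaulter pairs ranked correctly; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def gini(scores, labels):
    """Gini coefficient under the assumed relation Gini = 2 * AUC - 1."""
    return 2 * auc(scores, labels) - 1

# Hypothetical scores for six borrowers, three of whom defaulted.
example_gini = gini([0.9, 0.7, 0.4, 0.6, 0.3, 0.1], [1, 1, 1, 0, 0, 0])
```

The pairwise form is quadratic in the number of borrowers, so it suits small illustrations; a sort-based computation would be the usual choice on a full loan book.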

    Using Memory-Based Reasoning For Predicting Default Rates On Consumer Loans

    In recent years, financial institutions have struggled with high default rates for consumer lending. An ability to reliably predict the probability of consumer loan defaults would have a significant impact on the profitability of that lending for these institutions. In response to this need, financial institutions have employed loan analysis techniques such as logistic regression, discriminant analysis, and various machine learning techniques to improve the accuracy of detecting loan defaults. The objective of these techniques is to more precisely distinguish creditworthy applicants, who are granted credit, thereby increasing profits, from non-creditworthy applicants, who would then be denied credit, thus decreasing losses. The objective of this article is to apply an emergent data analysis technique, memory-based (or case-based) reasoning, to this problem and to test its accuracy in discriminating between good and bad loans. This paper examines historical data from consumer loans issued by a financial institution to individuals that the financial institution considered to be qualified customers. The data set consists of the financial attributes of each customer and includes a mixture of loans that the customers paid off and loans on which they defaulted. The paper then compares the performance of this technique to other data mining techniques proposed in earlier works and analyzes the risk of default inherent in each loan for each technique.

    Statistical modelling to predict corporate default for Brazilian companies in the context of Basel II using a new set of financial ratios

    This paper deals with statistical modelling to predict failure of Brazilian companies in the light of the Basel II definition of default using a new set of explanatory variables. A rearrangement in the official format of the Balance Sheet is put forward. From this rearrangement a framework of complementary non-conventional ratios is proposed. Initially, a model using 22 traditional ratios is constructed. Problems associated with multicollinearity were found in this model. Adding a group of 6 non-conventional ratios alongside traditional ratios improves the model substantially. The main findings in this study are: (a) logistic regression performs well in the context of Basel II, yielding a sound model applicable in the decision making process; (b) the complementary list of financial ratios plays a critical role in the model proposed; (c) the variables selected in the model show that when current assets and current liabilities are split into two sub-groups - financial and operational - they are more effective in explaining default than the traditional ratios associated with liquidity; and (d) those variables also indicate that high interest rates in Brazil adversely affect the performance of those companies which have a higher dependency on borrowing.

    Credit-Scoring Methods (in English)

    The paper reviews the best-developed and most frequently applied methods of credit scoring employed by commercial banks when evaluating loan applications. The authors concentrate on retail loans – applied research in this segment is limited, though there has been a sharp increase in the volume of loans to retail clients in recent years. Logit analysis is identified as the most frequent credit-scoring method used by banks. However, other nonparametric methods are widespread in terms of pattern recognition. The methods reviewed have potential for application in post-transition countries. Keywords: banking sector, credit scoring, discrimination analysis, pattern recognition, retail loans

    The credit risk evaluation models: an application of data mining techniques

    In the banking sector, credit risk assessment is an important operation in ensuring that loans are paid on time and that banks maintain their credit performance effectively. Despite the considerable effort devoted to credit scoring each year, the high percentage of loan defaults remains a major issue. With the availability of large volumes of banking data and advanced analytics tools, classification data mining algorithms can be applied to develop a credit scoring platform and to address the loan default problem. Using a dataset of 5,960 observations describing the characteristics of loans with underlying collateral, the paper sets out a data mining process to compare the performance of four classification algorithms: logistic regression, decision tree, neural network, and XGBoost. On confusion-matrix and Monte Carlo simulation benchmarks, XGBoost emerges as the most accurate and profitable model, and it is highly consistent in the major factors it identifies as contributing to the likelihood of default.