545 research outputs found

    How Secure Are Good Loans: Validating Loan-Granting Decisions And Predicting Default Rates On Consumer Loans

    The failure or success of the banking industry depends largely on the industry's ability to properly evaluate credit risk. In the consumer-lending context, the bank's goal is to maximize income by issuing as many good loans to consumers as possible while avoiding losses associated with bad loans. Mistakes could severely affect profits because the losses associated with one bad loan may undermine the income earned on many good loans. Therefore, banks carefully evaluate the financial status of each customer as well as their creditworthiness and weigh them against the bank's internal loan-granting policies. Recognizing that even a small improvement in credit scoring accuracy translates into significant future savings, the banking industry and the scientific community have been employing various machine learning and traditional statistical techniques to improve credit risk prediction accuracy. This paper examines historical data from consumer loans issued by a financial institution to individuals that the financial institution deemed to be qualified customers. The data consists of the financial attributes of each customer and includes a mixture of loans that the customers paid off and defaulted upon. The paper uses three different data mining techniques (decision trees, neural networks, logit regression) and the ensemble model, which combines the three techniques, to predict whether a particular customer defaulted or paid off his/her loan. The paper then compares the effectiveness of each technique and analyzes the risk of default inherent in each loan and group of loans. The data mining classification techniques and analysis can enable banks to more precisely classify consumers into various credit risk groups. Knowing what risk group a consumer falls into would allow a bank to fine-tune its lending policies by recognizing high-risk groups of consumers to whom loans should not be issued, and identifying safer loans that should be issued, on terms commensurate with the risk of default.
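
    The abstract does not say how the ensemble combines the three techniques. One common option is a majority vote over the base models' labels, and the sketch below assumes that scheme. The three functions are hypothetical stand-ins for fitted models, and the attribute names `debt_ratio` and `late_payments` are invented for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(models, applicant):
    """Ask every base model for a label and combine the labels by majority vote."""
    return majority_vote([model(applicant) for model in models])

# Hypothetical stand-ins for the three trained models. Real ones would be a
# fitted decision tree, neural network, and logit regression.
def tree_like(a):
    return "default" if a["debt_ratio"] > 0.5 else "paid"

def network_like(a):
    return "default" if a["late_payments"] >= 2 else "paid"

def logit_like(a):
    score = 2.0 * a["debt_ratio"] + 0.5 * a["late_payments"] - 1.5
    return "default" if score > 0 else "paid"

models = [tree_like, network_like, logit_like]
applicant = {"debt_ratio": 0.6, "late_payments": 1}
label = ensemble_predict(models, applicant)
```

    With three base models and two labels a vote can never tie, which is one reason an odd number of models is convenient for this scheme.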

    Does Removing/Replacing Missing Values Improve The Models' Classification Performances?

    The paper explores the effect of removing or replacing missing values on the classification performance of several models. The original data set, which contains a relatively large number of missing values, comes from the credit scoring context. This data set was not used to build the models directly; instead, it was converted to five other data sets in which missing values were either removed or replaced using different techniques. The models were built and tested on these five data sets. Preliminary computer simulation showed that the models created and tested on the four data sets in which missing values were replaced exhibited significantly better predictive performance than the model built and tested on the data set with missing values removed.
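
    The difference between removing and replacing missing values can be sketched on a toy table. The abstract does not name the replacement techniques it used, so column-mean imputation is an assumption here, and the records are hypothetical.

```python
from statistics import mean

def drop_incomplete(rows):
    """Listwise deletion: keep only rows with no missing (None) values."""
    return [row for row in rows if None not in row]

def impute_mean(rows):
    """Replace each missing value with the mean of its column's observed values."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical records: (income in thousands, years at current job).
rows = [
    [40, 2],
    [None, 5],
    [60, None],
    [50, 8],
]

removed = drop_incomplete(rows)   # only the complete rows survive
replaced = impute_mean(rows)      # every row survives, gaps are filled
```

    Removal shrinks the training set to the complete rows, while replacement keeps every record, which is one plausible reason for the performance gap the abstract reports.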

    Does Feature Reduction Help Improve the Classification Accuracy Rates? A Credit Scoring Case Using a German Data Set

    The paper broadly discusses data reduction and data transformation, which are important tasks in the knowledge discovery process and data mining. In general, these activities improve the performance of predictive models. In particular, the paper investigates the effect of feature reduction on classification accuracy rates. A preliminary computer simulation performed on a German data set drawn from the credit scoring context shows mixed results. The six models built on the data set with four independent features perform generally worse than the models created on the same data set with all 20 input features.

    Optimization Problems And Genetic Algorithms

    This paper presents an application of genetic algorithms (GAs) to the well-known traveling salesman problem (TSP), which is a challenging optimization task. Using the techniques of selection, crossover, and mutation borrowed from Darwin's theory of evolution, GAs were able to find the optimal solution after generating only 24 populations of solutions instead of exploring more than a million possible solutions.
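
    The three operators named in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the tournament selection, the simplified order crossover, the swap mutation, the elitism step, the eight-city instance, and all parameter values are assumptions chosen for brevity.

```python
import math
import random

def tour_length(tour, cities):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def select(population, cities, rng, k=3):
    """Tournament selection: the shortest of k randomly drawn tours wins."""
    return min(rng.sample(population, k), key=lambda t: tour_length(t, cities))

def crossover(parent_a, parent_b, rng):
    """Simplified order crossover: keep a slice of parent_a in place and
    fill the remaining positions in parent_b's order."""
    i, j = sorted(rng.sample(range(len(parent_a)), 2))
    kept = parent_a[i:j + 1]
    rest = [city for city in parent_b if city not in kept]
    return rest[:i] + kept + rest[i:]

def mutate(tour, rng, rate=0.2):
    """With probability `rate`, swap two randomly chosen cities."""
    tour = tour[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(tour)), 2)
        tour[a], tour[b] = tour[b], tour[a]
    return tour

def evolve(cities, generations=50, pop_size=30, seed=0):
    """Run the GA and return the best tour in the final population."""
    rng = random.Random(seed)
    n = len(cities)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        best = min(population, key=lambda t: tour_length(t, cities))
        children = [best]  # elitism: the best tour always survives
        while len(children) < pop_size:
            child = crossover(select(population, cities, rng),
                              select(population, cities, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return min(population, key=lambda t: tour_length(t, cities))

# Hypothetical instance: eight cities on the corners and edge midpoints of a square.
cities = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
best_tour = evolve(cities)
best_length = tour_length(best_tour, cities)
```

    Because the best tour of each generation is carried over unchanged, the best length never gets worse from one population to the next. A GA of this kind gives no guarantee of reaching the optimum on an arbitrary instance.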

    Rule Induction Methods For Credit Scoring

    Credit scoring is the term used by the credit industry to describe methods used for classifying applicants for credit into risk classes according to their likely repayment behavior (e.g. "default" and "non-default"). The credit industry has been using such methods as logistic regression, discriminant analysis, and various machine learning techniques to more precisely identify creditworthy applicants who are granted credit, and non-creditworthy applicants who are denied credit. Accurate classification is of benefit both to the creditor (in terms of increased profit or reduced loss) and to the loan applicant (avoiding overcommitment). This paper examines historical data from consumer loans issued by a financial institution to individuals that the financial institution deemed to be qualified customers. The data set consists of the financial attributes of each customer and includes a mixture of loans that the customers paid off or defaulted upon. The paper uses rule induction methods (decision trees) to predict whether a particular applicant paid off or defaulted upon his/her loan. The main advantage of decision trees is their ability to generate if-then classification rules which are intuitive and easy to understand. Rules could be explained to business managers, who would need to approve their implementation, as well as to loan applicants as the reason for denying a loan. The paper compares the correct classification accuracy rates of several decision tree algorithms with other data mining methods proposed in earlier works.

    An Investigation Of The Effect Of Variable Reduction On Classification Accuracy Rates Of Consumer Loans

    The profitability of loan-granting institutions depends largely on the institutions' ability to accurately evaluate credit risk. Their goal is to maximize income by issuing as many good loans to consumers as possible while minimizing losses associated with bad loans. Financial institutions have been using various computational intelligence methods and statistical techniques to improve credit risk prediction accuracy. This paper examines historical data from consumer loans issued by a German bank to individuals. The data consists of the financial attributes of each customer and includes a mixture of loans that the customers paid off and defaulted upon. This paper examines and compares the classification effectiveness of four computational intelligence techniques: 1) logistic regression (LR), 2) neural networks (NNs), 3) support vector machines (SVM), and 4) k-nearest neighbor (kNN) on three data sets to predict whether a consumer defaulted or paid off a loan. The first data set contains a full set of 20 input variables. The second and third data sets contain a reduced set of ten and six variables, respectively. The results from computer simulation show that variable reduction has only a limited effect on classification performance.

    An Adaptive Neuro-Fuzzy Inference System Based Approach to Real Estate Property Assessment

    This paper describes a first effort to design and implement an approach, based on an adaptive neuro-fuzzy inference system, to estimate prices for residential properties. The data set consists of historic sales of homes in a market in the Midwestern USA, and it contains parameters describing typical residential property features and the actual sale price. The study explores the use of fuzzy inference systems to assess real estate property values and the use of neural networks in creating and fine-tuning the fuzzy rules used in the fuzzy inference system. The results are compared with those obtained using a traditional multiple regression model. The paper also describes possible future research in this area.