2,593 research outputs found

    Profit Maximizing Logistic Regression Modeling for Credit Scoring

    Multiple classification techniques have been employed for different business applications. In the particular case of credit scoring, a classifier that maximizes total profit is preferable. The recently proposed expected maximum profit (EMP) measure for credit scoring allows the most profitable classifier to be selected. Taking the idea of the EMP one step further, it is desirable to integrate the measure into model construction and thus obtain a profit-maximizing model. Therefore, in this work we propose a method based on the ProfLogit classifier, which optimizes the coefficients of a logistic regression model using a genetic algorithm. The proposed technique shows a significant improvement over regular maximum-likelihood-based logistic regression models on real-life data sets in terms of total profit, which is the ultimate goal for most businesses.
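
The idea of fitting logistic regression coefficients with a genetic algorithm against a profit objective can be illustrated with a minimal Python sketch. It is not the authors' ProfLogit implementation: the fitness function here is a simplified fixed gain/loss profit rather than the EMP measure, and the names `total_profit`, `fit_by_genetic_algorithm`, the `gain`, `loss` and `cutoff` values, and the toy data are all assumptions made for illustration.

```python
import math
import random


def predict_default_prob(coefs, x):
    """Logistic model: coefs[0] is the intercept, the rest are slopes."""
    z = coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))
    z = max(-35.0, min(35.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def total_profit(coefs, X, y, gain=1.0, loss=5.0, cutoff=0.5):
    """Toy profit (an assumption, not EMP): applicants scored below the
    cutoff are accepted; an accepted good loan (y=0) earns `gain`,
    an accepted default (y=1) costs `loss`."""
    profit = 0.0
    for x, label in zip(X, y):
        if predict_default_prob(coefs, x) < cutoff:
            profit += gain if label == 0 else -loss
    return profit


def fit_by_genetic_algorithm(X, y, pop_size=30, generations=40, seed=0):
    """Search coefficient vectors that maximize total_profit."""
    rng = random.Random(seed)
    n_coefs = len(X[0]) + 1

    def fitness(coefs):
        return total_profit(coefs, X, y)

    population = [[rng.uniform(-3, 3) for _ in range(n_coefs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # elitism: keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [g + rng.gauss(0, 0.3) for g in child]    # Gaussian mutation
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


# Toy data: one feature, label 1 = default.
X = [[0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 0, 1, 0, 1, 1]
best = fit_by_genetic_algorithm(X, y)
best_profit = total_profit(best, X, y)
print(best, best_profit)
```

Because the profit objective is piecewise constant in the coefficients, gradient-based fitting does not apply directly, which is one reason a derivative-free search such as a genetic algorithm is a natural fit.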

    Ensemble of Example-Dependent Cost-Sensitive Decision Trees

    Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples and not only within classes. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. In previous works, some methods that incorporate the financial costs into the training of different algorithms have been proposed, with the example-dependent cost-sensitive decision tree algorithm being the one that gives the highest savings. In this paper we propose a new framework of ensembles of example-dependent cost-sensitive decision trees. The framework consists of creating different example-dependent cost-sensitive decision trees on random subsamples of the training set, and then combining them using three different combination approaches. Moreover, we propose two new cost-sensitive combination approaches: cost-sensitive weighted voting and cost-sensitive stacking, the latter being based on the cost-sensitive logistic regression method. Finally, using five different databases from four real-world applications (credit card fraud detection, churn modeling, credit scoring and direct marketing), we evaluate the proposed method against state-of-the-art example-dependent cost-sensitive techniques, namely cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision trees. The results show that the proposed algorithms achieve better results for all databases, in the sense of higher savings. Comment: 13 pages, 6 figures, submitted for possible publication.
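
To make "example-dependent costs" and "savings" concrete, here is a small Python sketch. It assumes the common definition of savings as the relative cost reduction against the cheaper of the two trivial policies (predict all negative or all positive); the abstract does not spell out the formula, so treat that definition, the function names, and the toy fraud numbers as assumptions.

```python
def total_cost(y_true, y_pred, costs):
    """Sum per-example costs; costs[i] = (c_tp, c_fp, c_fn, c_tn)."""
    total = 0.0
    for yt, yp, (c_tp, c_fp, c_fn, c_tn) in zip(y_true, y_pred, costs):
        if yt == 1:
            total += c_tp if yp == 1 else c_fn
        else:
            total += c_fp if yp == 1 else c_tn
    return total


def savings(y_true, y_pred, costs):
    """Relative cost reduction versus the cheaper trivial classifier."""
    n = len(y_true)
    baseline = min(total_cost(y_true, [0] * n, costs),
                   total_cost(y_true, [1] * n, costs))
    if baseline == 0:
        raise ValueError("baseline cost is zero; savings is undefined")
    return (baseline - total_cost(y_true, y_pred, costs)) / baseline


# Hypothetical fraud example: flagging a transaction costs 10 in review
# effort, missing a fraud costs the transaction amount.
amounts = [100.0, 20.0, 50.0, 300.0]
costs = [(10.0, 10.0, amount, 0.0) for amount in amounts]
y_true = [1, 0, 0, 1]
y_pred = [1, 0, 1, 1]  # one false positive
model_savings = savings(y_true, y_pred, costs)
print(model_savings)
```

Note that the cost of a false negative differs per row, which is exactly what a single class-level cost matrix cannot express.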

    Bankruptcy Prediction of Small and Medium Enterprises Using a Flexible Binary Generalized Extreme Value Model

    We introduce a binary regression accounting-based model for bankruptcy prediction of small and medium enterprises (SMEs). The main advantage of the model lies in its predictive performance in identifying defaulted SMEs. Another advantage, which is especially relevant for banks, is that the relationship between the accounting characteristics of SMEs and the response is not assumed a priori (e.g., linear, quadratic or cubic) and can be determined from the data. The proposed approach uses the quantile function of the generalized extreme value distribution as the link function, as well as smooth functions of accounting characteristics to flexibly model covariate effects. Therefore, the usual assumptions in scoring models of a symmetric link function and linear or pre-specified covariate-response relationships are relaxed. Out-of-sample and out-of-time validation on Italian data shows that our proposal outperforms the commonly used (logistic) scoring model for different default horizons.
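
A short Python sketch can show what using the GEV quantile function as a link means and why it is asymmetric. It assumes the standard GEV distribution with location 0, scale 1 and shape `tau`, and models the response probability as the GEV cdf of the linear predictor; the paper's exact parameterisation may differ, and the function names and the value `tau = -0.25` are illustrative only.

```python
import math


def gev_link(p, tau):
    """GEV quantile function (location 0, scale 1): probability -> predictor."""
    if abs(tau) < 1e-12:  # Gumbel limit as tau -> 0
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-tau) - 1.0) / tau


def gev_inverse_link(eta, tau):
    """GEV cdf: linear predictor -> probability."""
    if abs(tau) < 1e-12:
        return math.exp(-math.exp(-eta))
    base = 1.0 + tau * eta
    if base <= 0:
        # eta lies outside the support of the distribution
        return 0.0 if tau > 0 else 1.0
    return math.exp(-base ** (-1.0 / tau))


def logit(p):
    """Symmetric logistic link, for comparison."""
    return math.log(p / (1.0 - p))


tau = -0.25
# A symmetric link satisfies link(p) == -link(1 - p); the GEV link does not.
logit_gap = logit(0.05) + logit(0.95)
gev_gap = gev_link(0.05, tau) + gev_link(0.95, tau)
print(logit_gap, gev_gap)
```

The asymmetry lets the fitted probability approach 0 and 1 at different rates, which is the property the abstract refers to when it says the symmetric-link assumption is relaxed.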

    Support Vector Machines for Credit Scoring and discovery of significant features

    The assessment of risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit scoring for determining the likelihood of default based on consumer application and credit reference agency data. We test support vector machines against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default.

    Survival Analysis in LGD Modeling

    The paper proposes an application of the survival time analysis methodology to estimations of the Loss Given Default (LGD) parameter. The main advantage of the survival analysis approach compared to classical regression methods is that it allows exploiting partial recovery data. The model is also modified in order to improve performance of the appropriate goodness of fit measures. The empirical testing shows that the Cox proportional model applied to LGD modeling performs better than the linear and logistic regressions. In addition, a significant improvement is achieved with the modified “pseudo” Cox LGD model. Keywords: credit risk, recovery rate, loss given default, correlation, regulatory capital

    Development and application of consumer credit scoring models using profit-based classification measures

    This paper presents a new approach for consumer credit scoring, by tailoring a profit-based classification performance measure to credit risk modeling. This performance measure takes into account the expected profits and losses of credit granting and thereby better aligns the model developers' objectives with those of the lending company. It is based on the Expected Maximum Profit (EMP) measure and is used to find a trade-off between the expected losses (driven by the exposure of the loan and the loss given default) and the operational income given by the loan. Additionally, one of the major advantages of the proposed measure is that it permits calculating the optimal cutoff value, which is necessary for model implementation. To test the proposed approach, we use a dataset of loans granted by a government institution, and benchmark the accuracy and monetary gain of using EMP, accuracy, and the area under the ROC curve as measures for selecting model parameters and for determining the respective cutoff values. The results show that our proposed profit-based classification measure outperforms the alternative approaches in terms of both accuracy and monetary value in the test set, and that it facilitates model deployment.
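
The trade-off between expected losses and operational income, and the resulting choice of cutoff, can be sketched in a few lines of Python. This is a simplified empirical profit scan, not the EMP measure itself (which, as described, is an expected value rather than a single realised profit); the fixed `lgd` and `income_rate`, the function names, and the toy portfolio are assumptions for illustration.

```python
def profit_at_cutoff(cutoff, pd_scores, defaulted, exposure,
                     lgd=0.6, income_rate=0.1):
    """Profit when every loan with predicted default probability below
    `cutoff` is granted. A granted default loses lgd * exposure; a granted
    good loan earns income_rate * exposure."""
    profit = 0.0
    for score, is_default, ead in zip(pd_scores, defaulted, exposure):
        if score < cutoff:
            profit += -lgd * ead if is_default else income_rate * ead
    return profit


def best_cutoff(pd_scores, defaulted, exposure, lgd=0.6, income_rate=0.1):
    """Scan candidate cutoffs and return (cutoff, profit) with highest profit.
    Scores are assumed to lie in [0, 1), so a cutoff of 1.0 grants every loan."""
    candidates = sorted(set(pd_scores)) + [1.0]
    return max(
        ((c, profit_at_cutoff(c, pd_scores, defaulted, exposure,
                              lgd, income_rate)) for c in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical portfolio of five scored loans.
pd_scores = [0.05, 0.10, 0.20, 0.40, 0.70]
defaulted = [0, 0, 1, 0, 1]
exposure = [1000.0, 2000.0, 1500.0, 1000.0, 3000.0]
cutoff, profit = best_cutoff(pd_scores, defaulted, exposure)
print(cutoff, profit)
```

In this toy portfolio the scan stops granting loans just before the first default, because one loss of `lgd * exposure` outweighs the income from several good loans; an accuracy-based cutoff would not see that asymmetry.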

    "Can Banks Learn to Be Rational?"

    Can banks learn to be rational in their lending activities? The answer depends on the institutionally bounded constraints to learning. From an evolutionary perspective, the functionality (for survival) of "learning to be rational" creates strong incentives for such learning without, however, guaranteeing that each member of the particular economic species actually achieves increased fitness. I investigate this issue for a particular economic species, namely, commercial banks. The purpose of this paper is to illustrate the key issues related to learning in an economic model by proposing a new screening model for bank commercial loans that uses the neuro fuzzy technique. The technical modeling aspect is integrally connected in a rigorous way to the key conceptual and theoretical aspects of the capabilities for learning to be rational in a broad but precise sense. This paper also compares the relative predictability of loan default among three methods of prediction (discriminant analysis, logit type regression, and neuro fuzzy) based on real data obtained from one of the banks in Taiwan. The neuro fuzzy model, in contrast with the other two, incorporates recursive learning in a real-world, imprecise linguistic environment. The empirical results show that, in addition to its better screening ability, the neuro fuzzy model is superior in explaining the relationship among the variables as well. With further modifications, this model could be used by bank regulatory agencies for loan examination and by bank loan officers for loan review. The main theoretical conclusion to draw from this demonstration is that non-linear learning in a vague semantic world is both possible and useful. Therefore the search for alternatives to full neoclassical rationality and its equivalent under uncertainty, rational expectations, is a plausible and desirable search, especially when the probability of convergence to a rational expectations equilibrium is low.

    An Analysis of Accuracy using Logistic Regression and Time Series

    This paper analyzes the accuracy rates for logistic regression and time series models. It also examines a relatively new performance index that takes into consideration the business assumptions of credit markets. Although prior research has focused on evaluation metrics such as AUC and the Gini index, this new measure has a more intuitive interpretation for managers and decision makers and can be applied to both logistic and time series models.