
    On Improving Performance of the Binary Logistic Regression Classifier

    Logistic regression, being both a predictive and an explanatory method, is one of the most commonly used statistical and machine learning methods in almost all disciplines. There are many situations, however, in which the fitted model predicts either the success event or the failure event with low accuracy. Several statistical and machine learning approaches exist in the literature to handle these situations. This thesis presents several new approaches to improve the performance of the fitted model, and the proposed methods have been applied to real datasets. Transformation of predictors is a common approach in fitting multiple linear and binary logistic regression models. The credit industry relies heavily on binary logistic regression for credit scoring of potential customers, and almost always transforms predictors before fitting the model. The first improvement proposed here is the use of the point biserial correlation coefficient in selecting predictor transformations. The second problem presented in this thesis is the application of the Bayesian method in fitting a logistic regression model. The problem of improving the performance of the logistic regression classifier for minority event cases is also considered. Two different clustering-based methods are developed: (i) the method of selective bootstrap, which oversamples the cases that best represent the minority class, and (ii) the method of clustered parametric or nonparametric simulation to oversample the minority cases. Both of these approaches are applied to real-world datasets and significantly improve the predictive accuracies. The results from the proposed methods have been presented at international conferences, and three articles have been published in peer-reviewed journals, with one article submitted for publication.
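The abstract does not spell out how the point biserial correlation is used to choose a transformation, so the following is only a minimal sketch under stated assumptions: each candidate transformation of a predictor is scored by the absolute point biserial correlation between the transformed values and the binary outcome, and the highest-scoring candidate is kept. The function names, the candidate set, and the "largest absolute correlation" rule are illustrative assumptions, not the thesis's documented procedure.

```python
import math


def point_biserial(x, y):
    """Point biserial correlation between a numeric x and a binary y (0/1).

    Uses r = (mean1 - mean0) / sd_x * sqrt(n1 * n0 / n^2), where sd_x is the
    population standard deviation of x. This equals the Pearson correlation
    between x and the 0/1 indicator y.
    """
    n = len(x)
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    n1, n0 = len(x1), len(x0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both outcome classes must be present")
    mean_x = sum(x) / n
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    if sd_x == 0:
        return 0.0  # a constant predictor carries no information
    mean1, mean0 = sum(x1) / n1, sum(x0) / n0
    return (mean1 - mean0) / sd_x * math.sqrt(n1 * n0 / (n * n))


# Hypothetical candidate set; log and sqrt assume strictly positive inputs.
CANDIDATES = {
    "identity": lambda v: v,
    "log": math.log,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
}


def best_transformation(x, y, candidates=CANDIDATES):
    """Score each candidate by |r_pb| and return (best_name, all_scores).

    Assumed selection rule for illustration: keep the transformation whose
    transformed predictor is most strongly correlated with the outcome.
    """
    scores = {
        name: point_biserial([f(v) for v in x], y)
        for name, f in candidates.items()
    }
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores


if __name__ == "__main__":
    x = [1, 10, 100, 1000]
    y = [0, 0, 1, 1]
    name, scores = best_transformation(x, y)
    print(name, scores)
```

In this toy input the predictor spans several orders of magnitude, so the log transformation spreads the values evenly and yields the strongest correlation with the outcome. A real application would also have to handle non-positive values and validate the choice on held-out data.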