
    A Hybrid Machine Learning Approach for Credit Scoring Using PCA and Logistic Regression

    Credit scoring is one mechanism lenders use to evaluate risk before extending credit to applicants. It helps distinguish creditworthy applicants from those who are not. Credit scoring comprises a set of decision models whose underlying techniques aid lenders in issuing consumer credit. Logistic regression (LR) is an adaptation of linear regression that makes flexible assumptions about the data and can also handle qualitative indicators. The major shortcoming of the LR model is its inability to deal with the cooperative (overfitting) effect of the variables. Principal Component Analysis (PCA) is a feature extraction method used to filter out irrelevant features, which lowers model training time and cost and can improve model performance. This study evaluates the shortcomings of simple models and proposes an efficient and robust machine learning technique that combines LR and PCA to evaluate firms in the deposit-taking SACCO sector. To achieve this, an experimental methodology is adopted. The proposed hybrid model has two stages. The first stage uses PCA to transform the original variables into new, uncorrelated variables. The second stage applies LR to the principal component values to compute the credit scores. Inferences and conclusions were drawn from the analysis of the collected data using Matlab.
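
    The two-stage structure described above can be sketched roughly as follows. This is an illustration only, not the study's implementation: the study used Matlab, whereas this sketch uses Python with NumPy, and the data are synthetic. The function names (fit_pca, transform_pca, fit_logistic, credit_score), the number of retained components, the learning rate and the iteration count are all assumptions made for the example.

```python
import numpy as np


def fit_pca(X, n_components):
    """Stage 1 (fit): learn standardization and the top principal directions."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T


def transform_pca(X, mean, std, components):
    """Stage 1 (apply): project standardized data onto the components."""
    return ((X - mean) / std) @ components


def fit_logistic(T, y, lr=0.1, n_iter=2000):
    """Stage 2 (fit): logistic regression by plain gradient descent."""
    A = np.hstack([np.ones((T.shape[0], 1)), T])  # prepend intercept column
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ w))
        w -= lr * A.T @ (p - y) / len(y)  # gradient of the mean log-loss
    return w


def credit_score(T, w):
    """Stage 2 (apply): predicted probability of the positive class."""
    A = np.hstack([np.ones((T.shape[0], 1)), T])
    return 1.0 / (1.0 + np.exp(-A @ w))


# Synthetic stand-in data (assumption): six correlated indicators
# generated from two hidden factors, plus a binary good/bad label.
rng = np.random.default_rng(0)
n = 400
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(n, 6))
y = (latent[:, 0] + 0.5 * latent[:, 1] + 0.3 * rng.normal(size=n) > 0).astype(float)

mean, std, components = fit_pca(X, n_components=2)
T = transform_pca(X, mean, std, components)  # uncorrelated component scores
w = fit_logistic(T, y)
scores = credit_score(T, w)
accuracy = ((scores >= 0.5) == (y == 1)).mean()
print(f"training accuracy: {accuracy:.3f}")
```

    Because the principal component scores are uncorrelated by construction, the logistic regression in the second stage is fitted on inputs that no longer carry the correlation present among the original indicators. In practice the number of components would be chosen from the explained variance, and accuracy would be measured on held-out data rather than on the training set as done here.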