Employee Churn Prediction using Logistic Regression and Support Vector Machine

Abstract

Retaining existing employees is often a greater challenge for a Human Resources (HR) team than hiring new ones. For any company, losing valuable employees is costly in terms of time, money, productivity, and trust. This loss could be reduced if HR could identify in advance the employees who are likely to quit. We therefore investigated the employee churn problem from a machine learning perspective. We designed models using supervised classification algorithms, namely Logistic Regression and Support Vector Machine (SVM). The models were trained on the IBM HR employee dataset retrieved from https://kaggle.com and then fine-tuned to improve their performance. Precision, recall, the confusion matrix, the ROC curve, and AUC were used to compare the models. The Logistic Regression model recorded an accuracy of 0.67, a sensitivity of 0.65, a specificity of 0.70, a Type I error of 0.30, a Type II error of 0.35, and an AUC score of 0.73, whereas the SVM achieved an accuracy of 0.93, a sensitivity of 0.98, a specificity of 0.88, a Type I error of 0.12, a Type II error of 0.01, and an AUC score of 0.96.
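
Accuracy, sensitivity, specificity, and the Type I and Type II error rates can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of that relationship, not the authors' code. It assumes that the positive class means "employee leaves". The function name `churn_metrics` and the counts passed to it are hypothetical and are not taken from the paper.

```python
def churn_metrics(tp, fp, tn, fn):
    """Derive rate metrics from confusion-matrix counts.

    Assumption: the positive class is "employee leaves".
    Both classes must be present, otherwise a denominator is zero.
    """
    positives = tp + fn   # employees who actually left
    negatives = tn + fp   # employees who actually stayed
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,    # true positive rate (recall)
        "specificity": tn / negatives,    # true negative rate
        "type_i_error": fp / negatives,   # false positive rate = 1 - specificity
        "type_ii_error": fn / positives,  # false negative rate = 1 - sensitivity
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = churn_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these definitions the Type I error is the complement of specificity and the Type II error is the complement of sensitivity, so each pair should sum to 1 up to rounding.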
