A Comparative Study on Hepatitis C Predictions Using Machine Learning Algorithms

Abstract

Hepatitis C virus (HCV) is a major cause of chronic liver disease. According to prior research, HCV causes more than 100,000 cases of liver cancer per year. The virus is responsible for at least 280,000 deaths. Diagnosing HCV requires at least two different tests, namely serological assays and molecular tests, which are costly and complex. Machine learning can support diagnosis by detecting patterns and relationships in data. This study therefore aims to predict Hepatitis C using different machine learning algorithms and to identify the best model for classifying the disease. It also presents visualizations that explore the relationships between attributes. We used six machine learning algorithms: K-Nearest Neighbour (k-NN), Support Vector Machine (SVM), Random Forest, Neural Network, Naïve Bayes, and Logistic Regression. Their performance was evaluated using four metrics: classification accuracy, precision, recall, and F1 score. The classification accuracies were 96.5%, 96.7%, 97.3%, 97.1%, 96%, and 97.9% for k-NN, SVM, Random Forest, Neural Network, Naïve Bayes, and Logistic Regression, respectively. Every model performed well, but Logistic Regression achieved the highest accuracy. We hope these results can support the diagnosis of HCV from laboratory data. It remains important, however, to communicate the shortcomings of each model and possible improvements.

Keywords: Machine Learning, Predictions, Hepatitis C Virus
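
For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they can be computed from true and predicted labels. It is not the code used in this study: the function name `binary_metrics`, the `Metrics` container, and the example labels are illustrative assumptions, and the sketch treats the task as binary (one positive class), which may differ from the study's setup.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or true positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Made-up labels for illustration only: 1 = HCV-positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

Accuracy alone can look high when one class dominates the data, which is why precision, recall, and F1 are reported alongside it.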
