Optimization of the Decision Tree Method using Pruning on Liver Disease Classification

Abstract

The large amount of available data on liver disease can be turned into useful information using the decision tree data mining method. However, the decision tree method has a weakness: over-fitting. An over-fitted tree can fit the training data well but usually performs poorly when applied to unseen data. The experiments in this study use the ILPD dataset from the UCI Machine Learning Repository, which contains 583 clinical records with 10 attributes and a target output of 416 liver-positive and 167 liver-negative cases. The decision tree algorithm was tested with and without pruning, and pruning increased accuracy. Measured from the confusion matrix, the decision tree without pruning reached an accuracy of 73.58%, while the decision tree with pruning reached 73.76%. Although the difference is small, it indicates that the decision tree with pruning achieves better accuracy. In addition, the models and rules generated by the decision tree can be used as the basis for developing a prototype application for liver disease classification.
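To put the reported figures in context, the following minimal Python sketch shows how accuracy is computed from a binary confusion matrix and applies it to the class counts given above. The majority-class baseline (labelling every record as positive) is a hypothetical reference point introduced here for illustration, not a result from the study, and the function and variable names are assumptions.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions in a binary confusion matrix."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Class counts reported in the abstract for the ILPD dataset.
N_POSITIVE = 416
N_NEGATIVE = 167

# Hypothetical baseline (not from the study): label every record positive.
# All 416 positives become true positives, all 167 negatives false positives.
baseline = accuracy(tp=N_POSITIVE, tn=0, fp=N_NEGATIVE, fn=0)

# Accuracies reported in the abstract, in percent.
ACC_UNPRUNED = 73.58
ACC_PRUNED = 73.76
gain = ACC_PRUNED - ACC_UNPRUNED

print(f"majority-class baseline: {baseline:.2%}")
print(f"gain from pruning: {gain:.2f} percentage points")
```

Under these assumptions the baseline comes to about 71.36%, and the gain from pruning is 0.18 percentage points, so both trees sit only a little above what the class imbalance alone would yield.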
