Ensemble methods for meningitis aetiology diagnosis

Abstract

In this work, we explore data-driven techniques for the fast and early diagnosis of the aetiological origin of meningitis, specifically for differentiating between viral and bacterial meningitis. We study how machine learning can be used to predict meningitis aetiology once a patient has been diagnosed with the disease. Our dataset comprises 26,228 patients described by 19 attributes, mainly the patient's observable symptoms and the early results of the cerebrospinal fluid analysis. Using this dataset, we have explored several techniques for dataset sampling and feature selection, together with classification models based both on ensemble methods and on simple techniques (mainly decision trees). Experiments with 27 classification models (19 of them involving ensemble methods) have been conducted for this paper. Our main finding is that the combination of ensemble methods with decision trees leads to the best meningitis aetiology classifiers. The best performance indicator values (precision, recall and F-measure of 89% and an AUC value of 95%) were achieved by combining bagging with NBTrees. Our results also suggest that combining ensemble methods with certain decision trees clearly improves diagnostic performance compared with using the corresponding decision tree alone.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. We would like to thank the Health Department of the Brazilian Government for providing the dataset and for authorizing its use in this study. We would also like to express our gratitude to the reviewers for their thoughtful comments and efforts towards improving our manuscript. Funding for open access charge: Universidad de Málaga / CBUA.
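
As a rough illustration of the kind of pipeline the abstract describes (bagging over tree-based learners, evaluated by precision, recall, F-measure and AUC), the following minimal sketch may help. It is not the authors' implementation. It assumes synthetic two-class data in place of the meningitis dataset and uses one-level decision stumps as a stand-in for NBTrees. All function names (`make_data`, `fit_stump`, `fit_bagging`, `evaluate`) are hypothetical.

```python
import random

def make_data(n, rng):
    """Synthetic stand-in data: 3 numeric features whose mean shifts with the label."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append(([rng.gauss(1.0 * y, 1.0) for _ in range(3)], y))
    return rows

def fit_stump(rows):
    """One-level decision tree: choose (feature, threshold, polarity) with fewest errors."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            # Errors made by the rule "predict 1 when x[f] > t".
            err = sum((xi[f] > t) != bool(yi) for xi, yi in rows)
            # Polarity 0 is the inverted rule, whose error count is the complement.
            for polarity, e in ((1, err), (0, len(rows) - err)):
                if best is None or e < best[0]:
                    best = (e, f, t, polarity)
    return best[1:]

def predict_stump(stump, x):
    f, t, polarity = stump
    pred = 1 if x[f] > t else 0
    return pred if polarity == 1 else 1 - pred

def fit_bagging(rows, n_models, rng):
    """Bagging: train each base learner on a bootstrap sample (drawn with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models

def bagging_score(models, x):
    """Fraction of base learners voting for class 1."""
    return sum(predict_stump(m, x) for m in models) / len(models)

def evaluate(models, rows):
    scores = [bagging_score(models, x) for x, _ in rows]
    labels = [y for _, y in rows]
    preds = [1 if s >= 0.5 else 0 for s in scores]  # majority vote
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # AUC as the probability that a positive outranks a negative (ties count 0.5).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "auc": auc}

rng = random.Random(0)            # fixed seed so the run is repeatable
train = make_data(150, rng)
test = make_data(100, rng)
models = fit_bagging(train, 15, rng)
metrics = evaluate(models, test)
print(metrics)
```

The numbers this prints describe only the synthetic data and say nothing about the results reported in the abstract. To compare an ensemble against its base learner, as the abstract does, one could also call `evaluate([fit_stump(train)], test)` and set the two dictionaries side by side.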
