Naive Bayes classification of uncertain data

Abstract

Traditional machine learning algorithms assume that data are exact or precise. This assumption may not hold in practice, because data uncertainty can arise from measurement errors, data staleness, repeated measurements, and similar sources. Under uncertainty, the value of each data item is represented by a probability distribution function (pdf) rather than by a single value. In this paper, we propose a novel naive Bayes classification algorithm for uncertain data represented by pdfs. Our key solution is to extend the class conditional probability estimation in the Bayes model to handle pdfs. Extensive experiments on UCI datasets show that the accuracy of the naive Bayes model can be improved by taking the uncertainty information into account.

© 2009 IEEE. The 9th IEEE International Conference on Data Mining (ICDM), Miami, FL, 6-9 December 2009. In Proceedings of the 9th ICDM, 2009, p. 944-94
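
The idea stated in the abstract, extending class conditional probability estimation so that it accepts a pdf instead of a point value, can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes, purely for illustration, that every uncertain attribute value is a Gaussian given as a (mean, variance) pair and that each class conditional density is also Gaussian. Under those assumptions the expected class conditional density of an uncertain value has a closed form: a Gaussian whose variance is the sum of the class variance and the measurement variance. The function names `fit`, `predict`, and `expected_log_likelihood`, and the `var_floor` parameter, are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_logpdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def expected_log_likelihood(obs_mean, obs_var, class_mean, class_var):
    """Log of E[N(x; class_mean, class_var)] for x ~ N(obs_mean, obs_var).

    Assumption: both pdfs are Gaussian, so the expectation equals a
    Gaussian density with the two variances added together.
    """
    return gaussian_logpdf(obs_mean, class_mean, class_var + obs_var)


def fit(samples, labels, var_floor=1e-9):
    """Estimate per-class priors and Gaussian parameters.

    samples: list of rows; each row is a list of (mean, variance) pairs,
             one pair per uncertain attribute (illustrative format).
    """
    grouped = defaultdict(list)
    for sample, label in zip(samples, labels):
        grouped[label].append(sample)

    model = {}
    for label, rows in grouped.items():
        params = []
        for j in range(len(rows[0])):
            means = [row[j][0] for row in rows]
            variances = [row[j][1] for row in rows]
            mu = sum(means) / len(means)
            # Variance of a mixture of the per-item pdfs:
            # spread of the means plus the average measurement variance.
            spread = sum((m - mu) ** 2 for m in means) / len(means)
            noise = sum(variances) / len(variances)
            params.append((mu, spread + noise + var_floor))
        model[label] = (len(rows) / len(samples), params)
    return model


def predict(model, sample):
    """Return the label with the highest log posterior score."""
    def log_score(label):
        prior, params = model[label]
        score = math.log(prior)
        for (obs_mean, obs_var), (mu, var) in zip(sample, params):
            score += expected_log_likelihood(obs_mean, obs_var, mu, var)
        return score

    return max(model, key=log_score)


# Toy data (made up for illustration): one uncertain attribute, two classes.
samples = [[(0.0, 0.1)], [(0.4, 0.1)], [(5.0, 0.1)], [(5.4, 0.1)]]
labels = ["a", "a", "b", "b"]
model = fit(samples, labels)
```

Setting every measurement variance to zero reduces this sketch to an ordinary Gaussian naive Bayes classifier, which is one way to see what the uncertainty information adds.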
