133 research outputs found

    Robust Bayes classifiers

    Naive Bayes classifiers provide an efficient and scalable approach to supervised classification problems. When some entries in the training set are missing, methods exist to learn these classifiers under some assumptions about the pattern of missing data. Unfortunately, reliable information about the pattern of missing data may not be readily available, and recent experimental results show that enforcing an incorrect assumption about the pattern of missing data produces a dramatic decrease in the accuracy of the classifier. This paper introduces a Robust Bayes Classifier (rbc) able to handle incomplete databases with no assumption about the pattern of missing data. In order to avoid assumptions, the rbc bounds all the possible probability estimates within intervals using a specialized estimation method. These intervals are then used to classify new cases by computing intervals on the posterior probability distributions over the classes given a new case and by ranking the intervals according to some criteria. We provide two scoring methods to rank intervals and a decision theoretic approach to trade off the risk of an erroneous classification against the choice of not classifying a case unequivocally. This decision theoretic approach can also be used to assess the opportunity of adopting assumptions about the pattern of missing data. The proposed approach is evaluated on twenty publicly available databases.
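The interval idea in this abstract can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes a single attribute, a class label that is always observed, and plain relative frequencies, and it uses strict interval dominance as the ranking rule, which is only one possible criterion. The function names `conditional_bounds`, `posterior_intervals`, and `classify_by_dominance` are hypothetical.

```python
from fractions import Fraction


def conditional_bounds(n_value, n_other, n_missing):
    """Bounds on P(X = x | class) over every completion of the missing entries.

    n_value: cases in the class where X was observed equal to x
    n_other: cases in the class where X was observed with another value
    n_missing: cases in the class where X was not recorded
    """
    total = n_value + n_other + n_missing
    if total == 0:
        raise ValueError("the class has no cases")
    lower = Fraction(n_value, total)              # no missing entry equals x
    upper = Fraction(n_value + n_missing, total)  # every missing entry equals x
    return lower, upper


def posterior_intervals(priors, bounds):
    """Interval on P(class | X = x) for each class (assumed: one attribute)."""
    intervals = {}
    for c in priors:
        lo_c, hi_c = bounds[c]
        rivals_hi = sum(priors[k] * bounds[k][1] for k in priors if k != c)
        rivals_lo = sum(priors[k] * bounds[k][0] for k in priors if k != c)
        low_num, high_num = priors[c] * lo_c, priors[c] * hi_c
        # The posterior is smallest when c is at its lower bound and rivals at their upper.
        lower = low_num / (low_num + rivals_hi) if low_num + rivals_hi else Fraction(0)
        upper = high_num / (high_num + rivals_lo) if high_num + rivals_lo else Fraction(1)
        intervals[c] = (lower, upper)
    return intervals


def classify_by_dominance(intervals):
    """Return the class whose lower bound beats every rival's upper bound, else None."""
    for c, (lower, _) in intervals.items():
        if all(lower > hi for k, (_, hi) in intervals.items() if k != c):
            return c
    return None  # intervals overlap: the case is left unclassified


priors = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
bounds = {"a": conditional_bounds(6, 2, 2), "b": conditional_bounds(1, 8, 1)}
intervals = posterior_intervals(priors, bounds)
label = classify_by_dominance(intervals)
```

With more missing entries the intervals widen and eventually overlap, at which point this sketch abstains instead of guessing.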

    Bayesian Clustering by Dynamics

    This paper introduces a Bayesian method for clustering dynamic processes. The method models dynamics as Markov chains and then applies an agglomerative clustering procedure to discover the most probable set of clusters capturing different dynamics. To increase efficiency, the method uses an entropy-based heuristic search strategy. A controlled experiment suggests that the method is very accurate when applied to artificial time series in a broad range of conditions and, when applied to clustering sensor data from mobile robots, it produces clusters that are meaningful in the domain of application.
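A rough sketch of the general approach described here: each series is summarized by its first-order transition counts, a cluster is scored by the Dirichlet-multinomial marginal likelihood of its pooled counts, and clusters are merged greedily while the score improves. The symmetric prior `alpha`, the uniform prior over partitions, and the exhaustive pairwise search are all assumptions of this sketch; the paper's entropy-based heuristic is not reproduced, and the function names are hypothetical.

```python
from itertools import combinations
from math import lgamma


def transition_counts(series, n_states):
    """Count first-order transitions i -> j in a sequence of integer states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(series, series[1:]):
        counts[a][b] += 1
    return counts


def log_marginal_likelihood(counts, alpha=1.0):
    """Log marginal likelihood of a count matrix, symmetric Dirichlet prior per row."""
    total = 0.0
    for row in counts:
        k = len(row)
        total += lgamma(k * alpha) - lgamma(k * alpha + sum(row))
        total += sum(lgamma(alpha + n) - lgamma(alpha) for n in row)
    return total


def add_counts(a, b):
    """Pool two transition count matrices element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def cluster_by_dynamics(series_list, n_states, alpha=1.0):
    """Greedy agglomerative clustering; returns sorted lists of series indices."""
    clusters = [([i], transition_counts(s, n_states)) for i, s in enumerate(series_list)]
    while len(clusters) > 1:
        best_gain, best_pair = 0.0, None
        # Assumption: every pair is tried, with no heuristic ordering of candidates.
        for i, j in combinations(range(len(clusters)), 2):
            merged = add_counts(clusters[i][1], clusters[j][1])
            gain = (log_marginal_likelihood(merged, alpha)
                    - log_marginal_likelihood(clusters[i][1], alpha)
                    - log_marginal_likelihood(clusters[j][1], alpha))
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break  # no merge makes the partition more probable
        i, j = best_pair
        merged = (clusters[i][0] + clusters[j][0], add_counts(clusters[i][1], clusters[j][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(members) for members, _ in clusters)


alternating = [0, 1] * 10        # switches state at every step
sticky = [0] * 10 + [1] * 10     # rarely switches state
groups = cluster_by_dynamics([alternating, sticky, alternating, sticky], n_states=2)
```

Trying every pair costs quadratic time per merge step, which is the kind of cost a heuristic search order is meant to reduce.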

    Robust outcome prediction for intensive care patients

    Missing data are a major plague of medical databases in general, and of Intensive Care Unit databases in particular. The time pressure of work in an Intensive Care Unit pushes physicians to omit data from the record, either randomly or selectively. These different omission strategies give rise to different patterns of missing data, and the recommended approach of completing the database using median imputation and fitting a logistic regression model can lead to significant biases. This paper applies a new classification method, called the robust Bayes classifier, that does not rely on any particular assumption about the pattern of missing data, and compares it to the traditional median imputation approach using a database of 324 Intensive Care Unit patients.
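A toy illustration of why median imputation can bias results when values are omitted selectively. The numbers are invented for illustration and are not from the paper; the sketch assumes the two highest readings went unrecorded and compares the mean after imputation with the mean of the complete data. No logistic regression is fitted here.

```python
from statistics import mean, median

# Hypothetical complete measurements (illustrative values only).
complete = [1, 2, 3, 4, 10, 12]

# Assumed selective omission: the two highest readings were not recorded.
recorded = [1, 2, 3, 4, None, None]

# Median imputation: replace each missing entry with the median of the observed ones.
observed = [v for v in recorded if v is not None]
fill = median(observed)
imputed = [fill if v is None else v for v in recorded]

true_mean = mean(complete)
imputed_mean = mean(imputed)
bias = imputed_mean - true_mean  # negative: the imputed data understate the mean
```

Had the same two entries been dropped at random, the observed median would not be systematically off in one direction; the bias here comes from the omission depending on the value itself.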