16 research outputs found

    To select or to weigh: a comparative study of linear combination schemes for superparent-one-dependence estimators

    We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of seminaive Bayesian classifiers. Altogether, 16 model selection and weighing schemes, 58 benchmark data sets, and various statistical tests are employed. This paper's main contributions are threefold. First, it formally presents each scheme's definition, rationale, and time complexity, and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers a bias-variance analysis of each scheme's classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms that have an immediate and significant impact on real-world applications. Another important feature of our study is the use of a variety of statistical tests to evaluate multiple learning methods across multiple data sets.
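    The two families of schemes the abstract contrasts can be illustrated with a minimal sketch (hypothetical names and a uniform-weight choice for the "weigh" branch; the paper studies 16 concrete schemes, none of which is reproduced exactly here). Each row of `spode_probs` is one SPODE's class-probability estimate for a single instance, and `val_acc` is a held-out accuracy per SPODE:

```python
import numpy as np

# Sketch: "select" keeps only the single best SPODE; "weigh" linearly
# combines all members (uniform weights here, purely for illustration).
def combine(spode_probs, val_acc, scheme="weigh"):
    if scheme == "select":
        # model selection: trust the member with the best held-out accuracy
        return spode_probs[np.argmax(val_acc)]
    # model weighing: weighted average of all members' probability estimates
    weights = np.ones(len(spode_probs))
    return weights @ spode_probs / weights.sum()
```

Selection discards information from the other ensemble members, while weighting retains all of them at the cost of evaluating every SPODE at classification time, which is the trade-off the study quantifies.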

    “Old Dogs” Can Learn Ultrasound


    Not So Naive Bayes: Aggregating One-Dependence Estimators


    Learning for anytime classification

    Many online applications of machine learning require that the learned classifiers complete classification within strict real-time constraints. In consequence, efficient classifiers such as naive Bayes (NB) are often employed because they can complete the required classification tasks even under peak computational loads. While NB provides acceptable accuracy, more computationally intensive approaches can improve upon it. The current paper explores techniques that utilize any additional computational resources available at classification time to improve upon the prediction accuracy of NB. This is achieved by augmenting NB with a sequence of super-parent one-dependence estimators. As many of these as possible are evaluated within the available computational resources, and the resulting set of probability estimates is aggregated to produce a final prediction. The algorithm is demonstrated to provide consistent improvements in accuracy as computational resources are increased.
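    The evaluate-as-many-as-possible-then-aggregate loop can be sketched as follows (a simplification with hypothetical names: `estimators` is an ordered list of callables returning class-probability lists, the first being NB, and `budget` is a step counter standing in for available CPU time):

```python
# Anytime classification sketch: run the first (cheap) estimator
# unconditionally, then add further estimators until the budget runs out,
# summing their probability estimates and predicting the best class.
def anytime_classify(estimators, x, budget):
    agg = None
    used = 0
    for est in estimators:
        if agg is not None and used >= budget:
            break  # interrupted: fall back to the estimates gathered so far
        p = est(x)
        agg = p if agg is None else [a + b for a, b in zip(agg, p)]
        used += 1
    return max(range(len(agg)), key=lambda c: agg[c])
```

With `budget=1` this degenerates to plain NB; larger budgets fold in more one-dependence estimators, which is the mechanism behind the reported accuracy gains.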

    Learning by extrapolation from marginal to full-multivariate probability distributions: Decreasingly naive Bayesian classification

    Averaged n-Dependence Estimators (AnDE) is an approach to probabilistic classification learning that learns by extrapolation from marginal to full-multivariate probability distributions. It utilizes a single parameter that transforms the approach between a low-variance high-bias learner (Naive Bayes) and a high-variance low-bias learner with Bayes optimal asymptotic error. It extends the underlying strategy of Averaged One-Dependence Estimators (AODE), which relaxes the Naive Bayes independence assumption while retaining many of Naive Bayes’ desirable computational and theoretical properties. AnDE further relaxes the independence assumption by generalizing AODE to higher levels of dependence. Extensive experimental evaluation shows that the bias-variance trade-off for Averaged 2-Dependence Estimators results in strong predictive accuracy over a wide range of data sets. It has training time linear with respect to the number of examples, learns in a single pass through the training data, supports incremental learning, directly handles missing values, and is robust in the face of noise. Beyond the practical utility of its lower-dimensional variants, AnDE is of interest in that it demonstrates that it is possible to create low-bias high-variance generative learners, and it suggests strategies for developing even more powerful classifiers.
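    The n=1 case (AODE) can be sketched for discrete data: each attribute in turn acts as a superparent, each resulting one-dependence estimator scores P(c, x_sp) * ∏ P(x_i | c, x_sp), and the scores are averaged. The sketch below is a toy illustration with Laplace smoothing, not the authors' exact estimator (which, e.g., restricts superparents by frequency):

```python
import numpy as np

# Toy AODE prediction for one query instance over discrete attributes.
# X: (n, d) integer array of training attributes; y: (n,) class labels;
# n_classes / n_vals: number of classes and of values per attribute.
def aode_predict(X, y, x_query, n_classes, n_vals):
    n, d = X.shape
    scores = np.zeros(n_classes)
    for c in range(n_classes):
        total = 0.0
        for sp in range(d):  # each attribute in turn as the superparent
            # smoothed joint P(c, x_sp)
            joint = (np.sum((y == c) & (X[:, sp] == x_query[sp])) + 1.0) \
                    / (n + n_classes * n_vals)
            mask = (y == c) & (X[:, sp] == x_query[sp])
            denom = mask.sum()
            prob = joint
            for i in range(d):
                if i == sp:
                    continue
                num = np.sum(mask & (X[:, i] == x_query[i]))
                prob *= (num + 1.0) / (denom + n_vals)  # P(x_i | c, x_sp)
            total += prob
        scores[c] = total / d  # average over the d one-dependence estimators
    return int(np.argmax(scores))
```

A single pass over the data suffices to collect the three count tables the formula needs, which is where the linear training time and incremental-learning properties come from.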

    Classifying under computational resource constraints: Anytime classification using probabilistic estimators

    In many online applications of machine learning, the computational resources available for classification will vary from time to time. Most techniques are designed to operate within the constraints of the minimum expected resources and fail to utilize further resources when they are available. We propose a novel anytime classification algorithm, anytime averaged probabilistic estimators (AAPE), which is capable of delivering strong prediction accuracy with little CPU time and of utilizing additional CPU time to increase classification accuracy. The idea is to run an ordered sequence of very efficient Bayesian probabilistic estimators (single improvement steps) until classification time runs out. Theoretical studies and empirical validations reveal that by properly identifying, ordering, invoking and ensembling single improvement steps, AAPE is able to accomplish accurate classification whenever it is interrupted. It is also able to output class probability estimates beyond simple 0/1-loss classifications, as well as adeptly handle incremental learning.
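    Two aspects the abstract emphasizes, ordering the improvement steps and outputting class probabilities, can be sketched together (hypothetical names; the held-out scores used for ordering are an assumption standing in for the paper's identification/ordering analysis):

```python
# Sketch: sort candidate improvement steps by a held-out score so the most
# useful ones run first, execute as many as the budget allows (at least one),
# then normalize the aggregate into class-probability estimates.
def ordered_anytime_probs(steps, scores, x, budget):
    order = sorted(range(len(steps)), key=lambda i: scores[i], reverse=True)
    agg = None
    for i in order[:max(1, budget)]:
        p = steps[i](x)  # unnormalized class-probability list
        agg = p if agg is None else [a + b for a, b in zip(agg, p)]
    total = sum(agg)
    return [v / total for v in agg]  # probabilities, not just a 0/1 label
```

Returning a normalized distribution rather than a hard label is what lets such an algorithm serve cost-sensitive decisions beyond plain 0/1-loss classification.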