
    On Evidence Weighted Mixture Classification

    2005 Joint Annual Meeting of the Interface and the Classification Society of North America, St. Louis, Missouri, 8-12 June 2005.
    Calculation of the marginal likelihood, or evidence, is a problem central to model selection and model averaging in a Bayesian framework. Many sampling methods, especially (reversible jump) Markov chain Monte Carlo techniques, have been devised to avoid explicit calculation of the evidence, but they are limited to models with a common parameterisation. It is desirable to extend model averaging to models with disparate architectures and parameterisations. In this paper we present a straightforward, general computational scheme for calculating the evidence, applicable to any model for which samples can be drawn from the posterior distribution of parameters conditioned on the data. The scheme is demonstrated on a simple feature subset selection example.
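
    The abstract does not reproduce the paper's own scheme, but the setting it describes (estimating the evidence using nothing more than samples from the posterior) can be illustrated with the classic harmonic mean estimator of Newton and Raftery. The Python sketch below is that estimator, not necessarily the authors' method; the function name and interface are our assumptions.

        import numpy as np

        def harmonic_mean_log_evidence(log_likelihoods):
            """Harmonic mean estimator of the marginal likelihood (evidence).

            Takes log-likelihood values evaluated at draws from the posterior
            p(theta | D) and returns an estimate of log p(D). This matches the
            setting in the abstract: only posterior samples are required.
            Note the estimator is notoriously high-variance; it is shown here
            purely to illustrate evidence estimation from posterior draws.
            """
            ll = np.asarray(log_likelihoods, dtype=float)
            n = ll.size
            # log p(D) ~= -log(mean(exp(-ll))) = log n - logsumexp(-ll),
            # computed stably by factoring out the largest term.
            m = (-ll).max()
            return np.log(n) - (m + np.log(np.exp(-ll - m).sum()))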

    Biclustering models for structured microarray data

    ©2005 IEEE.
    Microarrays have become a standard tool for investigating gene function, and more complex microarray experiments are increasingly being conducted. For example, an experiment may involve samples from several groups, or may investigate changes in gene expression over time for several subjects, leading to large three-way data sets. In response to this increase in data complexity, we propose some extensions to the plaid model, a biclustering method developed for the analysis of gene expression data. This model-based method lends itself to the incorporation of additional structure such as external grouping or repeated measures. We describe how the extended models may be fitted and illustrate their use on real data.
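
    As a rough illustration of the underlying plaid model (not the paper's extensions), each bicluster layer models the expression of its rows i and columns j as mu + alpha_i + beta_j, and layers are added one at a time to the residual matrix. The sketch below fits those effects for one layer whose row/column membership is already fixed; the full algorithm also searches over memberships, and the function name is ours.

        import numpy as np

        def fit_plaid_layer(X, rows, cols):
            """Least-squares fit of a single plaid-model layer.

            For a fixed bicluster (rows, cols), the layer effect on entry
            (i, j) is mu + alpha_i + beta_j, estimated by the usual two-way
            ANOVA decomposition of the submatrix. Returns the effects and
            the residual matrix from which the next layer would be fitted.
            """
            Z = X[np.ix_(rows, cols)].astype(float)
            mu = Z.mean()
            alpha = Z.mean(axis=1) - mu   # row effects (sum to zero)
            beta = Z.mean(axis=0) - mu    # column effects (sum to zero)
            residual = X.astype(float)
            residual[np.ix_(rows, cols)] -= mu + alpha[:, None] + beta[None, :]
            return mu, alpha, beta, residual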

    Representing classifier confidence in the safety critical domain: an illustration from mortality prediction in trauma cases

    Copyright © 2007 Springer-Verlag. The final publication is available at link.springer.com.
    This work proposes a novel approach to assessing confidence measures for software classification systems in demanding applications such as those in the safety-critical domain. Our focus is the Bayesian framework for developing a model-averaged probabilistic classifier, implemented using Markov chain Monte Carlo (MCMC) and, where appropriate, its reversible jump variant (RJ-MCMC). Within this context we suggest a new technique, building on the reject-region idea, to identify areas in feature space that are associated with "unsure" classification predictions. We term such areas "uncertainty envelopes"; they are defined in terms of the full characteristics of the posterior predictive density in different regions of the feature space. We argue this is more informative than a traditional reject region, which considers only point estimates of predictive probabilities. Results from the proposed method are illustrated on synthetic data and usefully applied to real-life safety-critical systems involving medical trauma data.
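
    To make the distinction concrete, the sketch below contrasts a traditional reject region, which looks only at the point estimate of the predictive probability, with an envelope-style rule that looks at the spread of posterior predictive samples. The function names, shapes and threshold values are illustrative assumptions, not taken from the paper.

        import numpy as np

        def reject_region(mean_prob, band=(0.4, 0.6)):
            """Traditional reject option (binary case): abstain when the
            point estimate of P(class 1 | x) falls in a band around 0.5."""
            lo, hi = band
            return (np.asarray(mean_prob) > lo) & (np.asarray(mean_prob) < hi)

        def uncertainty_envelope(sample_probs, agreement=0.95):
            """Envelope-style rule: flag a point "unsure" unless a large
            enough fraction of posterior predictive samples agree on its
            class. sample_probs has shape (n_mcmc_samples, n_points)."""
            frac_class1 = (np.asarray(sample_probs) > 0.5).mean(axis=0)
            return np.maximum(frac_class1, 1.0 - frac_class1) < agreement

        # Two points with the same mean probability 0.5 can differ sharply:
        # samples tightly spread around 0.5 indicate genuine ambiguity,
        # while samples split between 0.1 and 0.9 signal model disagreement
        # that a point-estimate reject region cannot see.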

    Computing with confidence: a Bayesian approach

    Bayes’ rule is introduced as a coherent strategy for multiple recomputations of classifier system output, and thus as a basis for assessing the uncertainty associated with a particular system result, i.e. a basis for confidence in the accuracy of each computed result. We use a Markov chain Monte Carlo method for efficient selection of recomputations to approximate the computationally intractable elements of the Bayesian approach. The estimate of the confidence to be placed in any classification result provides a sound basis for rejecting some classification results. We present uncertainty envelopes as one way to derive these confidence estimates from the population of recomputed results. We show that a coarse SURE or UNSURE confidence rating based on a threshold of agreed classifications works well, not only pinpointing those results that are reliable but also indicating input-data problems, such as corrupted or incomplete data, or application of an inadequate classifier model.
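
    A minimal sketch of the coarse rating the abstract describes, assuming the recomputed outcomes are already collected as integer class labels (names, shapes and the threshold are our assumptions):

        import numpy as np

        def sure_unsure(outcomes, threshold=0.99):
            """Coarse SURE/UNSURE rating from recomputed classifier outcomes.

            outcomes: (n_recomputations, n_points) integer class labels, one
            row per recomputation (e.g. per MCMC-sampled model). A point is
            rated SURE when at least `threshold` of the recomputations agree
            on a single class; otherwise UNSURE."""
            outcomes = np.asarray(outcomes)
            n = outcomes.shape[0]
            ratings = []
            for col in outcomes.T:
                counts = np.bincount(col)
                frac = counts.max() / n
                ratings.append((int(counts.argmax()),
                                "SURE" if frac >= threshold else "UNSURE"))
            return ratings

    Corrupted or out-of-distribution inputs tend to scatter the recomputed outcomes across classes, so they surface naturally as UNSURE under such a rule.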

    A Bayesian methodology for estimating uncertainty of decisions in safety-critical systems

    Published as a chapter in Frontiers in Artificial Intelligence and Applications, vol. 149: Integrated Intelligent Systems for Engineering Design, edited by Xuan F. Zha and R.J. Howlett, IOS Press, 2006, ISBN 978-1-58603-675-1, pp. 82-96. This version deposited in arXiv: http://arxiv.org/abs/1012.0322.
    Uncertainty of decisions in safety-critical engineering applications can be estimated on the basis of the Bayesian Markov chain Monte Carlo (MCMC) technique of averaging over decision models. The use of decision tree (DT) models assists experts in interpreting causal relations and finding factors of the uncertainty. Bayesian averaging also allows experts to estimate the uncertainty accurately when a priori information on the favored structure of DTs is available; an expert can then select a single DT model, typically the maximum a posteriori model, for interpretation purposes. Unfortunately, a priori information on the favored structure of DTs is not always available. For this reason, we suggest a new prior on DTs for the Bayesian MCMC technique. We also suggest a new procedure for selecting a single DT and describe an application scenario. In our experiments on the Short-Term Conflict Alert data, our technique outperforms the existing Bayesian techniques in the predictive accuracy of the selected single DTs.
    Supported by a grant from the EPSRC under the Critical Systems Program, grant GR/R24357/0.
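
    A minimal sketch of the averaging and model-selection steps the abstract refers to, assuming the MCMC sampler has already produced trees together with their log posteriors and per-point class probabilities (names and shapes are our assumptions, not the paper's code):

        import numpy as np

        def bma_predict(tree_class_probs):
            """Bayesian model averaging over MCMC-sampled decision trees.

            tree_class_probs: (n_trees, n_points, n_classes). Because MCMC
            visits trees in proportion to their posterior probability, a
            plain average over the sample approximates the posterior
            predictive distribution."""
            return np.asarray(tree_class_probs).mean(axis=0)

        def select_map_tree(trees, log_posteriors):
            """Pick a single interpretable tree: the maximum a posteriori
            (MAP) model among those visited by the sampler."""
            return trees[int(np.argmax(log_posteriors))]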

    A Bayesian Methodology for Estimating Uncertainty of Decisions in Safety-Critical Systems

    In: Integrated Intelligent Systems for Engineering Design (editors: Zha, X.F. and Howlett, R.J.), Frontiers in Artificial Intelligence and Applications, vol. 149.

    Comparison of the Bayesian and Randomised Decision Tree Ensembles within an Uncertainty Envelope Technique

    Copyright © 2006 Springer. The final publication is available at link.springer.com.
    Multiple Classifier Systems (MCSs) allow evaluation of the uncertainty of classification outcomes, which is of crucial importance for safety-critical applications. The uncertainty of classification is determined by a trade-off between the amount of data available for training, the classifier diversity and the required performance. The interpretability of MCSs can also give useful information to experts responsible for making reliable classifications. For this reason Decision Trees (DTs) seem to be attractive classification models for experts. The required diversity of MCSs exploiting such classification models can be achieved by using two techniques: Bayesian model averaging and the randomised DT ensemble. Both techniques have shown promising results when applied to real-world problems. In this paper we experimentally compare the classification uncertainty of Bayesian model averaging with a restarting strategy and the randomised DT ensemble on a synthetic dataset and on some domain problems commonly used in the machine learning community. To make the Bayesian DT averaging feasible, we use a Markov chain Monte Carlo technique. The classification uncertainty is evaluated within an Uncertainty Envelope technique dealing with the class posterior distribution and a given confidence probability. Exploring a full posterior distribution, this technique produces realistic estimates which can be easily interpreted in statistical terms. In our experiments we found that the Bayesian DTs are superior to the randomised DT ensembles within the Uncertainty Envelope technique.
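
    The randomised half of such a comparison is easy to reproduce with off-the-shelf tools. The sketch below uses scikit-learn's random forest as a stand-in for the randomised DT ensemble and computes the per-point agreement fraction that an Uncertainty Envelope thresholds at a chosen confidence probability; the dataset and the 0.99 threshold are placeholders, and the Bayesian side would instead use trees drawn by MCMC from the posterior.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier

        X, y = make_classification(n_samples=600, n_features=10, random_state=0)
        X_tr, y_tr, X_te = X[:400], y[:400], X[400:]

        # Randomised DT ensemble: bootstrap sampling plus feature
        # randomisation provide the required classifier diversity.
        rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

        # Each sub-tree votes; sub-estimator outputs are indices into rf.classes_.
        member_preds = np.stack([t.predict(X_te).astype(int)
                                 for t in rf.estimators_])

        # Fraction of members agreeing with the modal class at each test point.
        counts = np.stack([np.bincount(member_preds[:, j], minlength=2)
                           for j in range(member_preds.shape[1])])
        agreement = counts.max(axis=1) / member_preds.shape[0]

        sure = agreement >= 0.99  # illustrative confidence probability
        print(f"rated SURE on {sure.mean():.0%} of test points")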

    Estimating Classification Uncertainty of Bayesian Decision Tree Technique on Financial Data

    Copyright © 2007 Springer. The final publication is available at link.springer.com.
    Book title: Perception-based Data Mining and Decision Making in Economics and Finance.
    Bayesian averaging over classification models allows the uncertainty of classification outcomes to be evaluated, which is of crucial importance for making reliable decisions in applications such as finance, in which risks have to be estimated. The uncertainty of classification is determined by a trade-off between the amount of data available for training, the diversity of a classifier ensemble and the required performance. The interpretability of classification models can also give useful information to experts responsible for making reliable classifications. For this reason Decision Trees (DTs) seem to be attractive classification models. The required diversity of the DT ensemble can be achieved by Bayesian model averaging over all possible DTs. In practice, the Bayesian approach can be implemented on the basis of a Markov chain Monte Carlo (MCMC) technique of random sampling from the posterior distribution. For sampling large DTs, the MCMC method is extended with a reversible jump technique which allows DTs to be induced under given priors. For the case when prior information on the DT size is unavailable, a sweeping technique, which defines the prior implicitly, reveals better performance. Within this chapter we explore the classification uncertainty of the Bayesian MCMC techniques on some datasets from the StatLog Repository and on real financial data. The classification uncertainty is compared within an Uncertainty Envelope technique dealing with the class posterior distribution and a given confidence probability. This technique provides realistic estimates of the classification uncertainty, which can be easily interpreted in statistical terms with the aim of risk evaluation.
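
    The sweeping prior is specific to this work, but the reversible-jump machinery it plugs into is standard. As a generic sketch (not the authors' implementation), a proposed tree move is accepted with a Metropolis-Hastings probability whose proposal ratio carries the dimension-matching terms:

        import math
        import random

        def rj_accept(log_post_new, log_post_old, log_proposal_ratio=0.0):
            """Generic Metropolis-Hastings acceptance test for a proposed
            tree move (e.g. grow, prune, change-split) in a Bayesian
            decision-tree sampler. For trans-dimensional (reversible jump)
            moves such as grow/prune, log_proposal_ratio includes the
            dimension-matching terms. Returns True to accept the proposal."""
            log_alpha = (log_post_new - log_post_old) + log_proposal_ratio
            return random.random() < math.exp(min(0.0, log_alpha))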

    Experimental Comparison of Classification Uncertainty for Randomised and Bayesian Decision Tree Ensembles

    Copyright © 2004 Springer-Verlag Berlin Heidelberg. The final publication is available at link.springer.com.
    Book title: Intelligent Data Engineering and Automated Learning – IDEAL 2004. 5th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2004), Exeter, UK, August 25-27, 2004.
    In this paper we experimentally compare the classification uncertainty of the randomised Decision Tree (DT) ensemble technique and the Bayesian DT technique with a restarting strategy on a synthetic dataset as well as on some datasets commonly used in the machine learning community. For quantitative evaluation of classification uncertainty, we use an Uncertainty Envelope dealing with the class posterior distribution and a given confidence probability. By counting the classifier outcomes, this technique produces feasible evaluations of the classification uncertainty. Using this technique in our experiments, we found that the Bayesian DT technique is superior to the randomised DT ensemble technique.