3,883 research outputs found

    Dynamic Bayesian Combination of Multiple Imperfect Classifiers

    Classifier combination methods need to make best use of the outputs of multiple, imperfect classifiers to enable higher accuracy classifications. In many situations, such as when human decisions need to be combined, the base decisions can vary enormously in reliability. A Bayesian approach to such uncertain combination allows us to infer the differences in performance between individuals and to incorporate any available prior knowledge about their abilities when training data is sparse. In this paper we explore Bayesian classifier combination, using the computationally efficient framework of variational Bayesian inference. We apply the approach to real data from a large citizen science project, Galaxy Zoo Supernovae, and show that our method far outperforms other established approaches to imperfect decision combination. We go on to analyse the putative community structure of the decision makers, based on their inferred decision making strategies, and show that natural groupings are formed. Finally we present a dynamic Bayesian classifier combination approach and investigate the changes in base classifier performance over time. Comment: 35 pages, 12 figures.

    Patterns of Scalable Bayesian Inference

    Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.

    Differential geometric MCMC methods and applications

    This thesis presents novel Markov chain Monte Carlo methodology that exploits the natural representation of a statistical model as a Riemannian manifold. The methods developed provide generalisations of the Metropolis-adjusted Langevin algorithm and the Hybrid Monte Carlo algorithm for Bayesian statistical inference, and resolve many shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlation structure. The performance of these Riemannian manifold Markov chain Monte Carlo algorithms is rigorously assessed by performing Bayesian inference on logistic regression models, log-Gaussian Cox point process models, stochastic volatility models, and both parameter and model level inference of dynamical systems described by nonlinear differential equations.

    Estimating Gene Interactions Using Information Theoretic Functionals

    With an abundance of data from high-throughput technologies such as DNA microarrays, a race has been on over the last few years to determine the structures and functions of genes and their products, the proteins. Inference of gene interactions lies at the core of these efforts. In all this activity, three important research issues have emerged. First, in much of the current literature on gene regulatory networks, dependencies among variables (in our case, genes) are assumed to be linear, when in real-life scenarios this is seldom the case. This disagreement leads to systematic deviation and biased evaluation. Second, although undersampling features in every piece of work as one of the major causes of poor results, in practice it is overlooked and rarely addressed explicitly. Finally, inference of network structures, although based on rigid mathematical foundations and computational optimizations, often displays poor fitness values and biologically unrealistic link structures, due to a large extent to the discovery of pairwise-only interactions. In our search for robust, nonlinear measures of dependency, we advocate that mutual information and related information-theoretic functionals (conditional mutual information, total correlation) are possibly the most suitable candidates to capture both linear and nonlinear interactions between variables and to resolve higher-order dependencies. To address these issues, we researched and implemented, under a common framework, a selection of nonparametric estimators of mutual information for continuous variables. Their assessment focused on their robustness to limited sample sizes and their extensibility to higher dimensions, which is important for the detection of more complex interaction structures.
    Two assessment scenarios were performed: one with simulated data, and one that bootstrapped the estimators in state-of-the-art network inference algorithms and monitored their predictive power and sensitivity. The tests revealed that, in small-sample regimes, the performance of different estimators differs significantly, and that naive methods such as uniform binning gave consistently poor results compared with more sophisticated methods. Finally, a custom, modular mechanism is proposed for the inference of gene interactions, targeting the identification of some of the most common substructures in genetic networks, which we believe will help improve accuracy and predictability scores.

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

    Universal rank-order transform to extract signals from noisy data

    We introduce an ordinate method for noisy data analysis, based solely on rank information and thus insensitive to outliers. The method is nonparametric and objective, and the required data processing is parsimonious. The main ingredients include a rank-order data matrix and its transform to a stable form, which provide linear trends in excellent agreement with least squares regression, despite the loss of magnitude information. A group symmetry orthogonal decomposition of the 2D rank-order transform for iid (white) noise is further ordered by principal component analysis. This two-step procedure provides a noise “etalon” used to characterize arbitrary stationary stochastic processes. The method readily distinguishes both the Ornstein-Uhlenbeck process and chaos generated by the logistic map from white noise. Ranking within randomness differs fundamentally from that in deterministic chaos and signals, thus forming the basis for signal detection. To further illustrate the breadth of applications, we apply this ordinate method to the canonical nonlinear parameter estimation problem of two-species radioactive decay, outperforming special-purpose least squares software. We demonstrate that the method excels when extracting trends in heavy-tailed noise and, unlike the Theil-Sen estimator, is not limited to linear regression. A simple expression is given that yields a close approximation for signal extraction of an underlying, generally nonlinear signal.