Dynamic Bayesian Combination of Multiple Imperfect Classifiers

Author: A.P. Dawid
A.P. Dempster
C. Fox
G. Parisi
G.J. Bierman
M. Girvan
M. West
N.M. Law
P. Abbeel
R.K. Dash
S. Geman
S. Kullback
S. Lefkimmiatis
S.M. Lee
T. Fawcett
V.C. Raykar
W.R. Gilks
Publication date: 08/06/2012

Classifier combination methods need to make best use of the outputs of multiple, imperfect classifiers to enable higher accuracy classifications. In many situations, such as when human decisions need to be combined, the base decisions can vary enormously in reliability. A Bayesian approach to such uncertain combination allows us to infer the differences in performance between individuals and to incorporate any available prior knowledge about their abilities when training data is sparse. In this paper we explore Bayesian classifier combination, using the computationally efficient framework of variational Bayesian inference. We apply the approach to real data from a large citizen science project, Galaxy Zoo Supernovae, and show that our method far outperforms other established approaches to imperfect decision combination. We go on to analyse the putative community structure of the decision makers, based on their inferred decision making strategies, and show that natural groupings are formed. Finally we present a dynamic Bayesian classifier combination approach and investigate the changes in base classifier performance over time. Comment: 35 pages, 12 figures

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Patterns of Scalable Bayesian Inference

Author: Adams Ryan P.
Angelino Elaine
Johnson Matthew James
Publication date: 01/01/2016

Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.

arXiv.org e-Print Archive

Crossref

CERN Document Server

Differential geometric MCMC methods and applications

Author: Calderhead Ben
Publication date: 01/01/2011

This thesis presents novel Markov chain Monte Carlo methodology that exploits the natural representation of a statistical model as a Riemannian manifold. The methods developed provide generalisations of the Metropolis-adjusted Langevin algorithm and the Hybrid Monte Carlo algorithm for Bayesian statistical inference, and resolve many shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlation structure. The performance of these Riemannian manifold Markov chain Monte Carlo algorithms is rigorously assessed by performing Bayesian inference on logistic regression models, log-Gaussian Cox point process models, stochastic volatility models, and both parameter and model level inference of dynamical systems described by nonlinear differential equations.

Glasgow Theses Service

OpenGrey Repository

Enabling Resilience in Cyber-Physical-Human Water Infrastructures

Author: Han Qing
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Rapid urbanization and growth in urban populations have forced community-scale infrastructures (e.g., water, power and natural gas distribution systems, and transportation networks) to operate at their limits. Aging (and failing) infrastructures around the world are becoming increasingly vulnerable to operational degradation, extreme weather, natural disasters and cyber attacks/failures. These trends have wide-ranging socioeconomic consequences and raise public safety concerns. In this thesis, we introduce the notion of cyber-physical-human infrastructures (CPHIs) - smart community-scale infrastructures that bridge technologies with physical infrastructures and people. CPHIs are highly dynamic stochastic systems characterized by complex physical models that exhibit regionwide variability and uncertainty under disruptions. Failures in these distributed settings tend to be difficult to predict and estimate, and expensive to repair. Real-time fault identification is crucial to ensure continuity of lifeline services to customers at adequate levels of quality. Emerging smart community technologies have the potential to transform our failing infrastructures into robust and resilient future CPHIs. In this thesis, we explore one such CPHI - community water infrastructures. Current urban water infrastructures, which are decades (sometimes over 100 years) old, encompass diverse geophysical regimes. Water stress concerns include the scarcity of supply and an increase in demand due to urbanization. Deterioration and damage to the infrastructure can disrupt water service; contamination events can result in economic and public health consequences. Unfortunately, little investment has gone into modernizing this key lifeline. To enhance the resilience of water systems, we propose an integrated middleware framework for quick and accurate identification of failures in complex water networks that exhibit uncertain behavior.
Our proposed approach integrates IoT-based sensing, domain-specific models and simulations with machine learning methods to identify failures (pipe breaks, contamination events). The composition of techniques results in cost-accuracy-latency tradeoffs in fault identification, inherent in CPHIs due to the constraints imposed by cyber components, physical mechanics and human operators. Three key resilience problems are addressed in this thesis: isolation of multiple faults under a small number of failures, state estimation of the water systems under extreme events such as earthquakes, and contaminant source identification in water networks using human-in-the-loop based sensing. By working with real-world water agencies (WSSC, DC and LADWP, LA), we first develop an understanding of operations of water CPHI systems. We design and implement a sensor-simulation-data integration framework, AquaSCALE, and apply it to localize multiple concurrent pipe failures. We use a mixture of infrastructure measurements (i.e., historical and live water pressure/flow), environmental data (i.e., weather) and human inputs (i.e., Twitter feeds), combined and enhanced with the domain model and supervised learning techniques, to locate multiple failures at fine levels of granularity (individual pipeline level) with detection time reduced by orders of magnitude (from hours/days to minutes). We next consider the resilience of water infrastructures under extreme events (i.e., earthquakes) - the challenge here is the lack of a priori knowledge and the increased number and severity of damages to infrastructures. We present a graphical model based approach for efficient online state estimation, where the offline graph factorization partitions a given network into disjoint subgraphs, and the belief propagation based inference is executed on-the-fly in a distributed manner on those subgraphs.
Our proposed approach can isolate 80% of broken pipes and 99% of loss-of-service to end-users during an earthquake. Finally, we address issues of water quality - today this is a human-in-the-loop process where operators need to gather water samples for lab tests. We incorporate the necessary abstractions with event processing methods into a workflow, which iteratively selects and refines the set of potential failure points via human-driven grab sampling. Our approach utilizes Hidden Markov Model based representations for event inference, along with reinforcement learning methods for further refining event locations and reducing the cost of human effort. The proposed techniques are integrated into a middleware architecture, which enables components to communicate/collaborate with one another. We validate our approaches through a prototype implementation with multiple real-world water networks, supply-demand patterns from water utilities and policies set by the U.S. EPA. While our focus here is on water infrastructures in a community, the developed end-to-end solution is applicable to other infrastructures and community services which operate in disruptive and resource-constrained environments.

eScholarship - University of California

Estimating Gene Interactions Using Information Theoretic Functionals

Author: Chan Georgia
Publication venue: Computing, Imperial College London
Publication date: 01/09/2009

With an abundance of data resulting from high-throughput technologies such as DNA microarrays, a race has been on in the last few years to determine the structures and functions of genes and their products, the proteins. Inference of gene interactions lies at the core of these efforts. In all this activity, three important research issues have emerged. First, in much of the current literature on gene regulatory networks, dependencies among variables - in our case genes - are assumed to be linear in nature, when in real-life scenarios this is seldom the case. This disagreement leads to systematic deviation and biased evaluation. Secondly, although the problem of undersampling features in every piece of work as one of the major causes of poor results, in practice it is overlooked and rarely addressed explicitly. Finally, inference of network structures, although based on rigid mathematical foundations and computational optimizations, often displays poor fitness values and biologically unrealistic link structures, due to a large extent to the discovery of pairwise-only interactions. In our search for robust, nonlinear measures of dependency, we advocate that mutual information and related information theoretic functionals (conditional mutual information, total correlation) are possibly the most suitable candidates to capture both linear and nonlinear interactions between variables, and resolve higher order dependencies. To address these issues, we researched and implemented, under a common framework, a selection of nonparametric estimators of mutual information for continuous variables. The focus of their assessment was their robustness to limited sample sizes and their extensibility to higher dimensions - important for the detection of more complex interaction structures.
Two different assessment scenarios were carried out: one with simulated data, and one bootstrapping the estimators in state-of-the-art network inference algorithms and monitoring their predictive power and sensitivity. The tests revealed that, in small sample size regimes, there is a significant difference in the performance of different estimators, and naive methods such as uniform binning gave consistently poor results compared with more sophisticated methods. Finally, a custom, modular mechanism is proposed for the inference of gene interactions, targeting the identification of some of the most common substructures in genetic networks, which we believe will help improve accuracy and predictability scores.
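
For readers unfamiliar with the "uniform binning" baseline mentioned in the abstract above, the following is a minimal sketch of a plug-in mutual information estimate built on equal-width bins. It is an illustration under stated assumptions, not the thesis's implementation: the function names `bin_index` and `binned_mutual_information` and the default of 4 bins are hypothetical choices made here.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, n_bins):
    """Map a value in [lo, hi] to one of n_bins equal-width bins."""
    if hi == lo:
        return 0  # degenerate range: everything falls in one bin
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value goes in the last bin


def binned_mutual_information(xs, ys, n_bins=4):
    """Plug-in mutual information estimate (in nats) from equal-width bins.

    Hypothetical helper for illustration only; n_bins is an assumed default.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    bx = [bin_index(x, x_lo, x_hi, n_bins) for x in xs]
    by = [bin_index(y, y_lo, y_hi, n_bins) for y in ys]
    joint = Counter(zip(bx, by))  # counts per (x-bin, y-bin) cell
    px = Counter(bx)              # marginal counts for x bins
    py = Counter(by)              # marginal counts for y bins
    mi = 0.0
    for (i, j), c in joint.items():
        # (c/n) * log( (c/n) / ((px/n) * (py/n)) )
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi
```

With few samples, many cells of the joint histogram are empty or hold a single point, which is one reason a binned estimate of this kind can behave poorly in the small-sample regimes the abstract describes.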

Spiral - Imperial College Digital Repository

Speech Recognition

Publication venue: IntechOpen
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes.

Directory of Open Access Books (DOAB)

Universal rank-order transform to extract signals from noisy data

Author: Ierley Glenn
Kostinski Alexander
Publication venue: Digital Commons @ Michigan Tech
Publication date: 20/06/2019

We introduce an ordinate method for noisy data analysis, based solely on rank information and thus insensitive to outliers. The method is nonparametric and objective, and the required data processing is parsimonious. The main ingredients include a rank-order data matrix and its transform to a stable form, which provide linear trends in excellent agreement with least squares regression, despite the loss of magnitude information. A group symmetry orthogonal decomposition of the 2D rank-order transform for iid (white) noise is further ordered by principal component analysis. This two-step procedure provides a noise “etalon” used to characterize arbitrary stationary stochastic processes. The method readily distinguishes both the Ornstein-Uhlenbeck process and chaos generated by the logistic map from white noise. Ranking within randomness differs fundamentally from that in deterministic chaos and signals, thus forming the basis for signal detection. To further illustrate the breadth of applications, we apply this ordinate method to the canonical nonlinear parameter estimation problem of two-species radioactive decay, outperforming special-purpose least squares software. We demonstrate that the method excels when extracting trends in heavy-tailed noise and, unlike the Theil-Sen estimator, is not limited to linear regression. A simple expression is given that yields a close approximation for signal extraction of an underlying, generally nonlinear signal.
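
The claim that rank information is insensitive to outliers can be illustrated with a short sketch. This is not the paper's rank-order transform; it only shows, under assumed toy data, that replacing a value by an extreme outlier leaves the ranks unchanged while the mean shifts. The helper `ranks` and both sample lists are hypothetical.

```python
def ranks(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def mean(values):
    """Arithmetic mean, used here as a magnitude-based statistic."""
    return sum(values) / len(values)


# Assumed toy data: the second series replaces the last point by an outlier.
clean = [1.0, 2.1, 2.9, 4.2, 5.0]
with_outlier = [1.0, 2.1, 2.9, 4.2, 500.0]

same_ranks = ranks(clean) == ranks(with_outlier)    # rank view is unchanged
mean_shift = mean(with_outlier) - mean(clean)       # magnitude view is not
```

Any statistic computed from the ranks alone therefore gives the same answer for both series, whereas the mean moves by 99.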

arXiv.org e-Print Archive

Michigan Technological University

Directory of Open Access Journals