Bayesian Nonparametric Spectral Estimation
Spectral estimation (SE) aims to identify how the energy of a signal (e.g., a
time series) is distributed across different frequencies. This can become
particularly challenging when only partial and noisy observations of the signal
are available, where current methods fail to handle uncertainty appropriately.
In this context, we propose a joint probabilistic model for signals,
observations and spectra, where SE is addressed as an exact inference problem.
Assuming a Gaussian process prior over the signal, we apply Bayes' rule to find
the analytic posterior distribution of the spectrum given a set of
observations. Besides its expressiveness and natural account of spectral
uncertainty, the proposed model also provides a functional-form representation
of the power spectral density, which can be optimised efficiently. A comparison
with previous approaches, in particular the Lomb-Scargle method, is presented
both theoretically and experimentally in three different scenarios. Code and
demo available at https://github.com/GAMES-UChile/BayesianSpectralEstimation.
Comment: 11 pages. In Advances in Neural Information Processing Systems, 2018.
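To make the construction concrete, here is a minimal sketch (not the authors'
code) of the underlying idea: condition a GP on noisy, partial samples of the
signal, then push posterior draws through the Fourier transform to obtain a
distribution over spectra. The squared-exponential kernel, observation
locations, and Monte Carlo approximation are all illustrative assumptions; the
paper itself derives the posterior over the spectrum analytically.

```python
import numpy as np

def se_kernel(x1, x2, ls=0.5, var=1.0):
    """Squared-exponential covariance (an assumed choice of GP prior)."""
    d = x1[:, None] - x2[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

rng = np.random.default_rng(0)

# Synthetic stand-in for partial, noisy observations of a latent signal.
t_obs = np.sort(rng.uniform(0, 10, 30))
y_obs = np.sin(2 * np.pi * 0.5 * t_obs) + 0.1 * rng.standard_normal(30)
noise_var = 0.1 ** 2

# Dense grid on which the signal is reconstructed before taking the FFT.
t_grid = np.linspace(0, 10, 512)

# Standard GP posterior over the signal given the observations.
K = se_kernel(t_obs, t_obs) + noise_var * np.eye(len(t_obs))
Ks = se_kernel(t_grid, t_obs)
mean = Ks @ np.linalg.solve(K, y_obs)
cov = se_kernel(t_grid, t_grid) - Ks @ np.linalg.solve(K, Ks.T)

# Monte Carlo posterior over the spectrum: draw posterior signals,
# take the periodogram of each draw, and summarise.
draws = rng.multivariate_normal(mean, cov + 1e-8 * np.eye(len(t_grid)),
                                size=200)
psd_draws = np.abs(np.fft.rfft(draws, axis=1)) ** 2
psd_mean, psd_std = psd_draws.mean(axis=0), psd_draws.std(axis=0)
freqs = np.fft.rfftfreq(len(t_grid), d=t_grid[1] - t_grid[0])
```

Here psd_std is exactly the kind of spectral uncertainty that point estimators
such as Lomb-Scargle do not report.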
Gradient Distribution Priors for Biomedical Image Processing
Ill-posed inverse problems are commonplace in biomedical image processing.
Their solution typically requires imposing prior knowledge about the latent
ground truth. While this regularizes the problem to an extent where it can be
solved, it also biases the result toward the expected. Since inappropriate
priors can do more harm than good, it often remains unclear which prior to
use for a given practical problem. Priors are hence mostly chosen in an
{\em ad hoc} or
empirical fashion. We argue here that the gradient distribution of
natural-scene images may provide a versatile and well-founded prior for
biomedical images. We provide motivation for this choice from different points
of view, and we fully validate the resulting prior for use on biomedical images
by showing its stability and correlation with image quality. We then provide a
set of simple parametric models for the resulting prior, leading to
straightforward (quasi-)convex optimization problems for which we provide
efficient solver algorithms. We illustrate the use of the present models and
solvers in a variety of common image-processing tasks, including contrast
enhancement, noise level estimation, denoising, blind deconvolution,
zooming/up-sampling, and dehazing. In all cases we show that the present method
leads to results that are comparable to or better than the state of the art;
always using the same, simple prior. We conclude by discussing the limitations
and possible interpretations of the prior.
Comment: submitted to journal.
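As a hedged illustration of how such a gradient prior is deployed, the sketch
below denoises an image by gradient descent on a data term plus a
hyper-Laplacian penalty on image gradients, one of the simple parametric forms
the abstract alludes to. The parameter values and the plain-gradient-descent
solver are assumptions for illustration; the paper provides dedicated
(quasi-)convex solvers.

```python
import numpy as np

def denoise_with_gradient_prior(y, lam=0.1, alpha=0.8, eps=1e-2,
                                step=0.05, n_iter=300):
    """Gradient descent on ||u - y||^2 + lam * sum((|grad u|^2 + eps)^(alpha/2)).
    alpha < 1 gives the heavy-tailed, hyper-Laplacian shape observed in
    natural-scene gradient histograms; eps smooths the non-differentiability."""
    u = y.astype(float).copy()
    for _ in range(n_iter):
        # Forward differences, zero past the last row/column.
        gx = np.zeros_like(u); gx[:, :-1] = u[:, 1:] - u[:, :-1]
        gy = np.zeros_like(u); gy[:-1, :] = u[1:, :] - u[:-1, :]
        mag2 = gx ** 2 + gy ** 2 + eps
        w = alpha * mag2 ** (alpha / 2.0 - 1.0)     # prior reweighting
        px, py = w * gx, w * gy
        # Divergence (negative adjoint of the forward difference).
        div = np.zeros_like(u)
        div[:, 0] += px[:, 0]; div[:, 1:] += px[:, 1:] - px[:, :-1]
        div[0, :] += py[0, :]; div[1:, :] += py[1:, :] - py[:-1, :]
        u -= step * (2.0 * (u - y) - lam * div)
    return u
```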
Variational Inference over Non-differentiable Cardiac Simulators using Bayesian Optimization
Performing inference over simulators is generally intractable, as their
runtime makes computing a marginal likelihood infeasible. We develop a
likelihood-free inference method to infer parameters for a cardiac simulator,
which replicates electrical flow through the heart to the body surface. We
improve the fit of a state-of-the-art simulator to an electrocardiogram (ECG)
recorded from a real patient.
Comment: Workshops on Deep Learning for Physical Sciences and Machine Learning
4 Health, NIPS 2017.
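A minimal sketch of the core mechanism, under stated assumptions: treat the
simulator as a black box, define a discrepancy between simulated and recorded
signals, and minimise it with a GP surrogate and a lower-confidence-bound rule.
The simulator stub, discrepancy, kernel, and parameter bounds below are all
hypothetical stand-ins; the paper optimises a variational objective rather
than a raw discrepancy.

```python
import numpy as np

def simulator(theta, rng):
    """Hypothetical stand-in for the non-differentiable cardiac simulator."""
    t = np.linspace(0, 1, 100)
    return (np.sin(2 * np.pi * theta[0] * t) * np.exp(-theta[1] * t)
            + 0.05 * rng.standard_normal(t.size))

def discrepancy(theta, y_obs, rng):
    """Likelihood-free loss: distance between simulated and recorded signal."""
    return np.mean((simulator(theta, rng) - y_obs) ** 2)

def rbf(a, b, ls=0.3):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

rng = np.random.default_rng(1)
lo, hi = np.array([1.0, 0.5]), np.array([5.0, 3.0])   # assumed bounds
y_obs = simulator(np.array([3.0, 1.5]), rng)          # pretend recording

X = rng.uniform(lo, hi, size=(5, 2))                  # initial design
f = np.array([discrepancy(x, y_obs, rng) for x in X])
for _ in range(20):
    cand = rng.uniform(lo, hi, size=(500, 2))
    K = rbf(X, X) + 1e-6 * np.eye(len(X))
    Ks = rbf(cand, X)
    mu = Ks @ np.linalg.solve(K, f)                   # GP surrogate mean
    var = 1.0 - np.einsum('ij,ij->i', Ks, np.linalg.solve(K, Ks.T).T)
    lcb = mu - 2.0 * np.sqrt(np.maximum(var, 0.0))    # acquisition
    x_next = cand[np.argmin(lcb)]
    X = np.vstack([X, x_next])
    f = np.append(f, discrepancy(x_next, y_obs, rng))
theta_hat = X[np.argmin(f)]
```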
Universal Approximation of Edge Density in Large Graphs
In this paper, we present a novel way to summarize the structure of large
graphs, based on non-parametric estimation of edge density in directed
multigraphs. Following a coclustering approach, we use a clustering of the
vertices, with a piecewise constant estimation of the density of the edges
across the clusters, and address the problem of automatically and reliably
inferring the number of clusters, which is the granularity of the coclustering.
We use a model selection technique with a data-dependent prior and obtain an
exact evaluation criterion for the posterior probability of edge density
estimation models. We demonstrate, both theoretically and empirically, that our
data-dependent modeling technique is consistent, resilient to noise, valid
non-asymptotically, and asymptotically behaves as a universal approximator of
the true edge density in directed multigraphs. We evaluate our method using
artificial graphs and demonstrate its practical value on real-world graphs. The
method is both robust and scalable: it extracts insightful patterns in the
unsupervised learning setting and provides state-of-the-art accuracy when used
as a preparation step for supervised learning.
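To illustrate the piecewise-constant estimator at the heart of the method,
here is a small sketch assuming an edge list and a given vertex clustering;
the toy BIC-style score stands in for the paper's exact data-dependent-prior
criterion for choosing the coclustering granularity.

```python
import numpy as np

def block_density(edges, labels, k):
    """Piecewise-constant directed edge density: edge count in each block of
    the coclustering divided by the number of possible vertex pairs in it."""
    counts = np.zeros((k, k))
    for u, v in edges:
        counts[labels[u], labels[v]] += 1
    sizes = np.bincount(labels, minlength=k).astype(float)
    return counts / np.maximum(np.outer(sizes, sizes), 1.0)

def penalized_score(edges, labels, k):
    """Toy granularity-selection score: log-likelihood of the edges under the
    block densities minus a BIC-style penalty on the number of blocks."""
    dens = block_density(edges, labels, k)
    ll = sum(np.log(max(dens[labels[u], labels[v]], 1e-12)) for u, v in edges)
    return ll - 0.5 * k * k * np.log(len(edges))
```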
A Bayesian spatial temporal mixtures approach to kinetic parametric images in dynamic Positron Emission Tomography
We present a fully Bayesian statistical approach to the problem of
compartmental modelling in the context of Positron Emission Tomography. We
cluster homogeneous regions of interest and perform kinetic parameter estimation
simultaneously. A mixture modelling approach is adopted, incorporating both
spatial and temporal information based on the reconstructed dynamic PET images. Our
modelling approach is flexible, and provides uncertainty estimates for the
estimated kinetic parameters. Crucially, the proposed method allows us to
determine the unknown number of clusters, which strongly affects the resulting
kinetic parameter estimates. We demonstrate our method on simulated dynamic
myocardial PET data, and show that it is superior to the standard
curve-fitting approach.
Comment: 30 pages.
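A point-estimate caricature of the mixture idea, for illustration only:
alternate hard assignment of voxel time-activity curves to clusters with
per-cluster kinetic fits. The mono-exponential model and the alternating
scheme are assumptions; the paper uses full compartmental models within a
fully Bayesian mixture that also infers the number of clusters.

```python
import numpy as np
from scipy.optimize import curve_fit

def kinetic_model(t, k1, k2):
    """Toy mono-exponential kinetic model (the paper uses full compartmental
    models driven by an arterial input function)."""
    return k1 * np.exp(-k2 * t)

def cluster_and_fit(tacs, t, n_clusters=3, n_iter=10, seed=0):
    """Alternate hard assignment of time-activity curves (tacs, shape N x T)
    to clusters and per-cluster kinetic fits."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=len(tacs))
    params = np.ones((n_clusters, 2))
    for _ in range(n_iter):
        for c in range(n_clusters):
            if np.any(labels == c):
                mean_tac = tacs[labels == c].mean(axis=0)
                params[c], _ = curve_fit(kinetic_model, t, mean_tac,
                                         p0=params[c], maxfev=5000)
        fits = np.array([kinetic_model(t, *p) for p in params])
        resid = ((tacs[:, None, :] - fits[None, :, :]) ** 2).sum(-1)
        labels = resid.argmin(axis=1)   # reassign to best-fitting cluster
    return labels, params
```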
A spatio-spectral hybridization for edge preservation and noisy image restoration via local parametric mixtures and Lagrangian relaxation
This paper investigates a fully unsupervised statistical method for edge
preserving image restoration and compression using a spatial decomposition
scheme. Smoothed maximum likelihood is used for local estimation of edge pixels
from parametric mixture models of local templates. For the complementary smooth
part, the traditional L2-variational problem is solved in the Fourier domain
with Thin Plate Spline (TPS) regularization. It is well known that naive
Fourier compression of the whole image fails to restore a piecewise-smooth
noisy image satisfactorily, due to the Gibbs phenomenon. Images are interpreted as
relative frequency histograms of samples from bi-variate densities where the
sample sizes might be unknown. In the continuous formulation of the problem,
the set of discontinuities is assumed to be a completely unknown, Lebesgue-null,
compact subset of the plane. The proposed spatial decomposition uses a widely
used topological concept, the partition of unity. Decisions on edge-pixel
neighborhoods are made using Holm's multiple testing procedure. The
statistical summary of the final output is decomposed into two layers of
information extraction, one for the subset of edge pixels and the other for the
smooth region. Robustness is also demonstrated by applying the technique to
noisy degradations of clean images.
Comment: 29 pages, 13 figures.
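The following sketch conveys the decomposition in miniature, under loud
assumptions: a hard gradient-quantile mask stands in for the Holm-based edge
test, and a quadratic frequency penalty stands in for the TPS-regularized
variational solve.

```python
import numpy as np

def hybrid_restore(img, edge_q=0.9, lam=5.0):
    """Denoise the smooth region in the Fourier domain while preserving the
    observed values at detected edge pixels."""
    gx, gy = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    # Hard mask standing in for a smooth partition of unity:
    # 1 on the smooth region, 0 on detected edge pixels.
    w = (mag < np.quantile(mag, edge_q)).astype(float)

    # Quadratic (thin-plate-like) frequency penalty on high frequencies.
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    filt = 1.0 / (1.0 + lam * (fx ** 2 + fy ** 2) ** 2)
    smooth = np.real(np.fft.ifft2(np.fft.fft2(img) * filt))

    # Blend: smoothed values on the smooth region, observed values at edges.
    return w * smooth + (1.0 - w) * img
```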
Determining the Number of Non-Spurious Arcs in a Learned DAG Model: Investigation of a Bayesian and a Frequentist Approach
In many application domains, such as computational biology, the goal of
graphical model structure learning is to uncover discrete relationships between
entities. For example, in our problem of interest concerning HIV vaccine
design, we want to infer which HIV peptides interact with which immune system
molecules (HLA molecules). For problems of this nature, we are interested in
determining the number of non-spurious arcs in a learned graphical model. We
describe both a Bayesian and a frequentist approach to this problem. In the
Bayesian approach, we use the posterior distribution over model structures to
compute the expected number of true arcs in a learned model. In the frequentist
approach, we develop a method based on the concept of the False Discovery Rate.
On synthetic data sets generated from models similar to the ones learned, we
find that both the Bayesian and frequentist approaches yield accurate estimates
of the number of non-spurious arcs. In addition, we speculate that the
frequentist approach, which is non-parametric, may outperform the parametric
Bayesian approach in situations where the models learned are less
representative of the data. Finally, we apply the frequentist approach to our
problem of HIV vaccine design.
Comment: Appears in Proceedings of the Twenty-Third Conference on Uncertainty
in Artificial Intelligence (UAI 2007).
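Both estimators reduce to short computations once arc-level quantities are
available; the sketch below assumes posterior arc probabilities (for the
Bayesian route) or arc scores plus a permutation null (for the frequentist
route), neither of which is specified in the abstract.

```python
import numpy as np

def expected_true_arcs(arc_posteriors):
    """Bayesian estimate: if p_i is the posterior probability that learned arc
    i is present in the true structure, the expected number of non-spurious
    arcs is simply the sum of the p_i."""
    return float(np.sum(arc_posteriors))

def fdr_arc_count(null_scores, arc_scores, q=0.05):
    """Frequentist sketch: empirical p-values for each learned arc against a
    null score distribution (e.g. from permuted data), then Benjamini-Hochberg
    to count the arcs retained at FDR level q."""
    null_sorted = np.sort(null_scores)
    p = 1.0 - np.searchsorted(null_sorted, arc_scores) / len(null_sorted)
    p_sorted = np.sort(p)
    m = len(p_sorted)
    keep = p_sorted <= q * np.arange(1, m + 1) / m
    return int(np.nonzero(keep)[0].max() + 1) if keep.any() else 0
```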
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Model-based reinforcement learning (RL) algorithms can attain excellent
sample efficiency, but often lag behind the best model-free algorithms in terms
of asymptotic performance. This is especially true with high-capacity
parametric function approximators, such as deep networks. In this paper, we
study how to bridge this gap, by employing uncertainty-aware dynamics models.
We propose a new algorithm called probabilistic ensembles with trajectory
sampling (PETS) that combines uncertainty-aware deep network dynamics models
with sampling-based uncertainty propagation. Our comparison to state-of-the-art
model-based and model-free deep RL algorithms shows that our approach matches
the asymptotic performance of model-free algorithms on several challenging
benchmark tasks, while requiring significantly fewer samples (e.g., 8 and 125
times fewer samples than Soft Actor Critic and Proximal Policy Optimization
respectively on the half-cheetah task).
Comment: NIPS 2018, video and code available at
https://sites.google.com/view/drl-in-a-handful-of-trials
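A compact sketch of the trajectory-sampling planner (random-shooting MPC over
a probabilistic ensemble). The learned networks, reward function, and action
bounds are abstracted into arguments here, and the particle-to-member binding
follows one of the TS variants (each particle bound to a fixed ensemble
member); none of this is the released implementation.

```python
import numpy as np

def pets_plan(models, reward_fn, state, horizon=10, n_particles=20,
              n_cand=100, act_dim=1, rng=None):
    """One planning step in the spirit of PETS: score random action sequences
    by propagating particles through a probabilistic ensemble, then return the
    first action of the best sequence. `models` is a list of callables
    (state, action) -> (mean, std) standing in for learned probabilistic
    networks; `state` is a 1-D array; `reward_fn` maps states to rewards."""
    rng = rng or np.random.default_rng()
    best_ret, best_actions = -np.inf, None
    for _ in range(n_cand):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, act_dim))
        states = np.tile(state, (n_particles, 1))
        # Bind each particle to one ensemble member for the whole rollout.
        member = rng.integers(len(models), size=n_particles)
        ret = 0.0
        for a in actions:
            for i in range(n_particles):
                mean, std = models[member[i]](states[i], a)
                states[i] = mean + std * rng.standard_normal(mean.shape)
            ret += reward_fn(states).mean()   # average over particles
        if ret > best_ret:
            best_ret, best_actions = ret, actions
    return best_actions[0]
```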
Bayesian active learning for optimization and uncertainty quantification in protein docking
Motivation: Ab initio protein docking represents a major challenge for
optimizing a noisy and costly "black box"-like function in a high-dimensional
space. Despite progress in this field, there is no docking method available for
rigorous uncertainty quantification (UQ) of its solution quality (e.g.
interface RMSD or iRMSD).
Results: We introduce a novel algorithm, Bayesian Active Learning (BAL), for
optimization and UQ of such black-box functions and flexible protein docking.
BAL directly models the posterior distribution of the global optimum (or native
structures for protein docking) with active sampling and posterior estimation
iteratively feeding each other. Furthermore, we use complex normal modes to
represent a homogeneous Euclidean conformation space suitable for
high-dimension optimization and construct funnel-like energy models for
encounter complexes. Over a protein docking benchmark set and a CAPRI set
including homology docking, we establish that BAL significantly improves upon
both the starting points from rigid docking and the refinements from particle
swarm optimization, providing a top-3 near-native prediction for one third of
the targets. BAL also generates tight confidence intervals, with a half-range
around 25% of the iRMSD at an 85% confidence level. Its estimated probability
of a prediction being near-native achieves a binary-classification AUROC of
0.93 and an AUPRC over 0.60 (compared to 0.14 by chance), and is also found
to help rank predictions.
To the best of our knowledge, this study represents the first uncertainty
quantification solution for protein docking, with theoretical rigor and
comprehensive assessment.
Source code is available at https://github.com/Shen-Lab/BAL.
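As a rough illustration (not the released code), the loop below maintains a
Boltzmann-style posterior over the minimiser via kernel regression on past
evaluations and samples the next batch from it, so that sampling and posterior
estimation feed each other. The kernel bandwidth, annealing schedule, and
candidate sampling are all assumptions.

```python
import numpy as np

def bal_optimize(f, lo, hi, n_rounds=15, batch=20, bw=0.2, seed=0):
    """Iterate active sampling and posterior estimation: f_hat is a kernel
    regression on all evaluations so far, and the next batch is drawn from a
    posterior over the minimiser, p(x) ~ exp(-beta_t * f_hat(x)), with an
    annealed inverse temperature beta_t."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, size=(batch, len(lo)))
    y = np.array([f(x) for x in X])
    for t in range(1, n_rounds + 1):
        beta = 0.5 * t                                  # annealing schedule
        cand = rng.uniform(lo, hi, size=(2000, len(lo)))
        d2 = ((cand[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2.0 * bw ** 2))
        f_hat = (w @ y) / np.maximum(w.sum(axis=1), 1e-12)  # kernel regression
        p = np.exp(-beta * (f_hat - f_hat.min()))
        p /= p.sum()
        idx = rng.choice(len(cand), size=batch, p=p, replace=False)
        X = np.vstack([X, cand[idx]])
        y = np.append(y, [f(x) for x in cand[idx]])
    return X[np.argmin(y)], y.min()
```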
Forecasting Turbulent Modes with Nonparametric Diffusion Models: Learning from noisy data
In this paper, we apply a recently developed nonparametric modeling approach,
the "diffusion forecast", to predict the time-evolution of Fourier modes of
turbulent dynamical systems. While the diffusion forecasting method assumes the
availability of a noise-free training data set observing the full state space
of the dynamics, in real applications we often have only partial observations
which are corrupted by noise. To alleviate these practical issues, following
the theory of embedology, the diffusion model is built using the
delay-embedding coordinates of the data. We show that this delay embedding
biases the geometry of the data in a way which extracts the most stable
component of the dynamics and reduces the influence of independent additive
observation noise. The resulting diffusion forecast model approximates the
semigroup solutions of the generator of the underlying dynamics in the limit of
large data and when the observation noise vanishes. As in any standard
forecasting problem, the forecasting skill depends crucially on the accuracy of
the initial conditions. We introduce a novel Bayesian method for filtering the
discrete-time noisy observations which works with the diffusion forecast to
determine the forecast initial densities.
Numerically, we compare this nonparametric approach with standard stochastic
parametric models on a wide range of well-studied turbulent modes, including
the Lorenz-96 model in weakly chaotic to fully turbulent regimes and the
barotropic modes of a quasi-geostrophic model with baroclinic instabilities. We
show that when the only available data is the low-dimensional set of noisy
modes being modeled, the diffusion forecast is indeed competitive with the
perfect model.
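A bare-bones sketch of the pipeline, with simplifying assumptions throughout:
delay-embed the noisy scalar series, build a diffusion-map basis from a
fixed-bandwidth kernel, and regress the one-step shift in that basis. The
paper's variable-bandwidth kernels, semigroup analysis, and Bayesian filtering
of the initial densities are omitted.

```python
import numpy as np

def delay_embed(x, lags):
    """Row t is (x_{t+lags-1}, ..., x_t): delay-embedding coordinates."""
    return np.column_stack([x[lags - 1 - j:len(x) - j] for j in range(lags)])

def diffusion_forecast(x, lags=8, n_eig=30):
    """One-step forecast of a noisy scalar series: diffusion-map basis on the
    delay-embedded data, linear shift operator in that basis. Assumes the
    series is much longer than both lags and n_eig."""
    E = delay_embed(np.asarray(x, float), lags)
    d2 = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / np.median(d2))                 # fixed-bandwidth kernel
    P = K / K.sum(axis=1, keepdims=True)            # Markov normalisation
    vals, vecs = np.linalg.eig(P)
    phi = vecs[:, np.argsort(-vals.real)[:n_eig]].real   # basis functions
    A, *_ = np.linalg.lstsq(phi[:-1], phi[1:], rcond=None)  # shift operator
    b, *_ = np.linalg.lstsq(phi, E[:, 0], rcond=None)       # observable map
    return (phi[-1] @ A) @ b            # forecast of x at the next step
```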