High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation
The ratio between two probability density functions is an important component
of various tasks, including selection bias correction, novelty detection and
classification. Recently, several estimators of this ratio have been proposed.
Most of these methods fail if the sample space is high-dimensional, and hence
require a dimension reduction step, the result of which can be a significant
loss of information. Here we propose a simple-to-implement, fully nonparametric
density ratio estimator that expands the ratio in terms of the eigenfunctions
of a kernel-based operator; these functions reflect the underlying geometry of
the data (e.g., submanifold structure), often leading to better estimates
without an explicit dimension reduction step. We show how our general framework
can be extended to address another important problem, the estimation of a
likelihood function in situations where that function cannot be
well-approximated by an analytical form. One is often faced with this situation
when performing statistical inference with data from the sciences, due to the
complexity of the data and of the processes that generated those data. We
emphasize applications where using existing likelihood-free methods of
inference would be challenging due to the high dimensionality of the sample
space, but where our spectral series method yields a reasonable estimate of the
likelihood function. We provide theoretical guarantees and illustrate the
effectiveness of our proposed method with numerical experiments.
Comment: With supplementary materia
Mining gold from implicit models to improve likelihood-free inference
Simulators often provide the best description of real-world phenomena.
However, they also lead to challenging inverse problems because the density
they implicitly define is often intractable. We present a new suite of
simulation-based inference techniques that go beyond the traditional
Approximate Bayesian Computation approach, which struggles in a
high-dimensional setting, and extend methods that use surrogate models based on
neural networks. We show that additional information, such as the joint
likelihood ratio and the joint score, can often be extracted from simulators
and used to augment the training data for these surrogate models. Finally, we
demonstrate that these new techniques are more sample efficient and provide
higher-fidelity inference than traditional methods.
Comment: Code available at
https://github.com/johannbrehmer/simulator-mining-example . v2: Fixed typos.
v3: Expanded discussion, added Lotka-Volterra example. v4: Improved clarit
Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families
We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive
MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities
where classical HMC is not an option due to intractable gradients, KMC
adaptively learns the target's gradient structure by fitting an exponential
family model in a Reproducing Kernel Hilbert Space. Computational costs are
reduced by two novel efficient approximations to this gradient. While being
asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and
offers substantial mixing improvements over state-of-the-art gradient-free
samplers. We support our claims with experimental studies on both toy and
real-world applications, including Approximate Bayesian Computation and
exact-approximate MCMC.
Comment: 20 pages, 7 figure
Bayesian Conditional Density Filtering
We propose a Conditional Density Filtering (C-DF) algorithm for efficient
online Bayesian inference. C-DF adapts MCMC sampling to the online setting,
sampling from approximations to conditional posterior distributions obtained by
propagating surrogate conditional sufficient statistics (a function of data and
parameter estimates) as new data arrive. These quantities eliminate the need to
store or process the entire dataset simultaneously and offer a number of
desirable features. Often, these include a reduction in memory requirements and
runtime and improved mixing, along with state-of-the-art parameter inference
and prediction. These improvements are demonstrated through several
illustrative examples including an application to high dimensional compressed
regression. Finally, we show that C-DF samples converge to the target posterior
distribution asymptotically as sampling proceeds and more data arrive.
Comment: 41 pages, 7 figures, 12 table
Bayesian optimisation for likelihood-free cosmological inference
Many cosmological models have only a finite number of parameters of interest,
but a very expensive data-generating process and an intractable likelihood
function. We address the problem of performing likelihood-free Bayesian
inference from such black-box simulation-based models, under the constraint of
a very limited simulation budget (typically a few thousand). To do so, we adopt
an approach based on the likelihood of an alternative parametric model.
Conventional approaches to approximate Bayesian computation such as
likelihood-free rejection sampling are impractical for the considered problem,
due to the lack of knowledge about how the parameters affect the discrepancy
between observed and simulated data. As a response, we make use of a strategy
previously developed in the machine learning literature (Bayesian optimisation
for likelihood-free inference, BOLFI), which combines Gaussian process
regression of the discrepancy to build a surrogate surface with Bayesian
optimisation to actively acquire training data. We extend the method by
deriving an acquisition function tailored for the purpose of minimising the
expected uncertainty in the approximate posterior density, in the parametric
approach. The resulting algorithm is applied to the problems of summarising
Gaussian signals and inferring cosmological parameters from the Joint
Lightcurve Analysis supernovae data. We show that the number of required
simulations is reduced by several orders of magnitude, and that the proposed
acquisition function produces more accurate posterior approximations, as
compared to common strategies.
Comment: 16+9 pages, 12 figures. Matches PRD published version after minor
modification
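For readers unfamiliar with the baseline that this abstract calls impractical, the following is a minimal sketch of likelihood-free rejection sampling (rejection ABC) on a toy problem. It is not the BOLFI method and not the authors' code. The Gaussian simulator, the uniform prior on [-5, 5], the sample-mean summary statistic, and the function name `abc_rejection` are all assumptions made for illustration.

```python
import random
import statistics


def abc_rejection(observed, n_draws, tolerance, seed=0):
    """Rejection ABC for the mean of a unit-variance Gaussian (toy example).

    Assumed setup: prior theta ~ Uniform(-5, 5), simulator y_i ~ N(theta, 1),
    summary statistic = sample mean, discrepancy = absolute difference.
    """
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw a parameter from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]  # run the simulator
        discrepancy = abs(statistics.fmean(simulated) - obs_summary)
        if discrepancy < tolerance:  # keep parameters that reproduce the data
            accepted.append(theta)
    return accepted


# Synthetic "observed" data generated with a true mean of 1.5.
data_rng = random.Random(42)
observed = [data_rng.gauss(1.5, 1.0) for _ in range(50)]
samples = abc_rejection(observed, n_draws=5000, tolerance=0.1)
print(len(samples), statistics.fmean(samples))
```

Most prior draws are rejected in this sketch, and every rejected draw still costs one simulator run. That waste is what makes the approach costly when each simulation is expensive and the budget is a few thousand runs.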
Marginal Likelihood Estimation with the Cross-Entropy Method
We consider an adaptive importance sampling approach to estimating the marginal likelihood, a quantity that is fundamental in Bayesian model comparison and Bayesian model averaging. This approach is motivated by the difficulty of obtaining an accurate estimate through existing algorithms that use Markov chain Monte Carlo (MCMC) draws, where the draws are typically costly to obtain and highly correlated in high-dimensional settings. In contrast, we use the cross-entropy (CE) method, a versatile adaptive Monte Carlo algorithm originally developed for rare-event simulation. The main advantage of the importance sampling approach is that random samples can be obtained from some convenient density at little additional cost. As we are generating independent draws instead of correlated MCMC draws, the increase in simulation effort is much smaller should one wish to reduce the numerical standard error of the estimator. Moreover, the importance density derived via the CE method is in a well-defined sense optimal. We demonstrate the utility of the proposed approach with two empirical applications involving women's labor market participation and U.S. macroeconomic time series. In both applications the proposed CE method compares favorably to existing estimators.
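The idea of estimating a marginal likelihood by importance sampling from an adaptively fitted density can be sketched on a toy conjugate model where the exact answer is known. This is an illustration in the spirit of the abstract, not the authors' algorithm. The model (y_i ~ N(theta, 1) with prior theta ~ N(0, 1)), the Gaussian importance family, the iterative weighted-moment update, and all function names are assumptions.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def log_joint(theta, data):
    """log p(data | theta) + log p(theta) for the assumed toy model."""
    log_lik = sum(log_normal_pdf(y, theta, 1.0) for y in data)
    return log_lik + log_normal_pdf(theta, 0.0, 1.0)


def log_weights(draws, data, mean, sd):
    """Log importance weights: log joint minus log importance density."""
    return [log_joint(t, data) - log_normal_pdf(t, mean, sd) for t in draws]


def ce_log_marginal(data, n_draws=2000, n_iter=5, seed=0):
    rng = random.Random(seed)
    mean, sd = 0.0, 1.0  # start the importance density at the prior
    for _ in range(n_iter):
        draws = [rng.gauss(mean, sd) for _ in range(n_draws)]
        lw = log_weights(draws, data, mean, sd)
        top = max(lw)
        w = [math.exp(v - top) for v in lw]
        total = sum(w)
        # Cross-entropy update for a Gaussian family: weighted mean and variance.
        mean = sum(wi * t for wi, t in zip(w, draws)) / total
        var = sum(wi * (t - mean) ** 2 for wi, t in zip(w, draws)) / total
        sd = max(math.sqrt(var), 1e-3)
    # Final estimate from fresh independent draws, using a stable log-mean-exp.
    draws = [rng.gauss(mean, sd) for _ in range(n_draws)]
    lw = log_weights(draws, data, mean, sd)
    top = max(lw)
    return top + math.log(sum(math.exp(v - top) for v in lw) / len(lw))


def exact_log_marginal(data):
    """Closed form for this model: y ~ N(0, I + 11^T)."""
    n, s, ss = len(data), sum(data), sum(y * y for y in data)
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * math.log(1 + n) - 0.5 * (ss - s * s / (1 + n))


data_rng = random.Random(7)
data = [data_rng.gauss(1.0, 1.0) for _ in range(20)]
estimate = ce_log_marginal(data)
exact = exact_log_marginal(data)
print(estimate, exact)
```

Because the fitted Gaussian is close to the true posterior in this toy case, the importance weights are nearly constant and the estimate has very low variance.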
Ensemble Kalman methods for high-dimensional hierarchical dynamic space-time models
We propose a new class of filtering and smoothing methods for inference in
high-dimensional, nonlinear, non-Gaussian, spatio-temporal state-space models.
The main idea is to combine the ensemble Kalman filter and smoother, developed
in the geophysics literature, with state-space algorithms from the statistics
literature. Our algorithms address a variety of estimation scenarios, including
on-line and off-line state and parameter estimation. We take a Bayesian
perspective, for which the goal is to generate samples from the joint posterior
distribution of states and parameters. The key benefit of our approach is the
use of ensemble Kalman methods for dimension reduction, which allows inference
for high-dimensional state vectors. We compare our methods to existing ones,
including ensemble Kalman filters, particle filters, and particle MCMC. Using a
real data example of cloud motion and data simulated under a number of
nonlinear and non-Gaussian scenarios, we show that our approaches outperform
these existing methods.
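As background for the ensemble Kalman filter mentioned above, here is a minimal sketch of a single stochastic (perturbed-observation) ensemble Kalman update for a scalar state observed directly with Gaussian noise. It shows only the basic building block, not the hierarchical methods the abstract proposes. The function name `enkf_update` and the toy numbers are assumptions.

```python
import random
import statistics


def enkf_update(forecast, observation, obs_var, rng):
    """One perturbed-observation EnKF update for a scalar state.

    Assumed observation model: y = x + noise, with noise ~ N(0, obs_var).
    """
    forecast_var = statistics.variance(forecast)  # ensemble estimate of state variance
    gain = forecast_var / (forecast_var + obs_var)  # scalar Kalman gain
    obs_sd = obs_var ** 0.5
    # Each member is shifted toward its own noisy copy of the observation.
    return [x + gain * (observation + rng.gauss(0.0, obs_sd) - x) for x in forecast]


rng = random.Random(1)
forecast = [rng.gauss(0.0, 2.0) for _ in range(2000)]  # forecast ensemble, variance 4
analysis = enkf_update(forecast, observation=3.0, obs_var=1.0, rng=rng)
print(statistics.fmean(analysis), statistics.variance(analysis))
```

In this linear-Gaussian toy case the exact posterior has mean 2.4 and variance 0.8, and the updated ensemble should be close to both. The ensemble replaces the full covariance with a sample estimate, which is what makes the approach feasible for high-dimensional states.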