Improving Simulation Efficiency of MCMC for Inverse Modeling of Hydrologic Systems with a Kalman-Inspired Proposal Distribution
Bayesian analysis is widely used in science and engineering for real-time
forecasting, decision making, and unraveling the processes that explain the
observed data. These data are deterministic and/or stochastic
transformations of the underlying parameters. A key task is then to summarize
the posterior distribution of these parameters. When models become too
difficult to analyze analytically, Monte Carlo methods can be used to
approximate the target distribution. Of these, Markov chain Monte Carlo (MCMC)
methods are particularly powerful. Such methods generate a random walk through
the parameter space and, under strict conditions of reversibility and
ergodicity, will successively visit solutions with frequency proportional to
the underlying target density. This requires a proposal distribution that
generates candidate solutions starting from an arbitrary initial state. The
rate at which the sampled chains converge to the target distribution
deteriorates rapidly, however, as parameter dimensionality increases. In this paper, we
introduce a new proposal distribution that significantly enhances the
efficiency of MCMC simulation for highly parameterized models. This proposal
distribution exploits the cross-covariance of model parameters, measurements,
and model outputs, and generates candidate states much like the analysis step
in the Kalman filter. We embed the Kalman-inspired proposal distribution in the
DREAM algorithm during burn-in, and present several numerical experiments with
complex, high-dimensional or multi-modal target distributions. Results
demonstrate that this new proposal distribution can greatly improve simulation
efficiency of MCMC. Specifically, we observe a speed-up on the order of 10-30
times for groundwater models with more than one hundred parameters.
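The analysis-step analogy can be sketched in a few lines. The following is a minimal illustration, not the paper's DREAM implementation: the linear forward model `A`, the ensemble size, and the noise covariance `R` are all toy assumptions (the paper targets nonlinear groundwater models).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear forward model (assumption for illustration): G(theta) = A @ theta.
n_par, n_obs, n_ens = 20, 15, 50
A = rng.standard_normal((n_obs, n_par))
theta_true = rng.standard_normal(n_par)
R = 0.1 * np.eye(n_obs)                           # measurement-error covariance
d = A @ theta_true + rng.multivariate_normal(np.zeros(n_obs), R)

def kalman_proposal(ensemble, forward, d, R, rng):
    """Generate candidate states by shifting each chain state toward the data,
    using the ensemble cross-covariance of parameters and simulated outputs
    (much like the analysis step of an ensemble Kalman filter)."""
    Y = np.array([forward(th) for th in ensemble])     # simulated outputs
    Th = ensemble - ensemble.mean(axis=0)              # centred parameters
    Yc = Y - Y.mean(axis=0)                            # centred outputs
    C_ty = Th.T @ Yc / (len(ensemble) - 1)             # param-output cross-cov
    C_yy = Yc.T @ Yc / (len(ensemble) - 1)             # output covariance
    K = C_ty @ np.linalg.inv(C_yy + R)                 # Kalman-like gain
    eps = rng.multivariate_normal(np.zeros(len(d)), R, size=len(ensemble))
    return ensemble + (d + eps - Y) @ K.T              # candidate states

ensemble = rng.standard_normal((n_ens, n_par))
candidates = kalman_proposal(ensemble, lambda th: A @ th, d, R, rng)

def misfit(E):
    return np.mean([np.sum((A @ th - d) ** 2) for th in E])

print(misfit(ensemble), misfit(candidates))
```

Inside a sampler such as DREAM, each candidate would still pass through the usual Metropolis accept/reject step; as the abstract notes, the paper embeds this proposal only during burn-in.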
The CMA Evolution Strategy: A Tutorial
This tutorial introduces the CMA Evolution Strategy (ES), where CMA stands
for Covariance Matrix Adaptation. The CMA-ES is a stochastic, or randomized,
method for real-parameter (continuous domain) optimization of non-linear,
non-convex functions. We try to motivate and derive the algorithm from
intuitive concepts and from requirements of non-linear, non-convex search in
continuous domain. Comment: ArXiv e-prints, arXiv:1604.xxxx
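As a companion to the tutorial's derivation, here is a deliberately stripped-down sketch of the core sample/recombine/adapt loop. It omits the evolution paths and cumulative step-size adaptation that the real CMA-ES relies on (a naive step-size decay stands in for them), and the constants are crude stand-ins for the tutorial's defaults, so treat it as illustrative only.

```python
import numpy as np

def simplified_cmaes(f, x0, sigma=0.5, iterations=100, seed=0):
    """Stripped-down CMA-ES sketch: sample lambda offspring from N(m, sigma^2 C),
    recombine the best mu with log-linear weights, and adapt C by a rank-mu
    update. The real CMA-ES adds evolution paths and cumulative step-size
    adaptation; a naive sigma decay stands in for the latter here."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    lam = 4 + int(3 * np.log(n))                  # default population size
    mu = lam // 2
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                                  # positive recombination weights
    c_mu = min(1.0, 2.0 * mu / n ** 2)            # crude rank-mu learning rate
    m, C = np.asarray(x0, dtype=float), np.eye(n)
    for _ in range(iterations):
        L = np.linalg.cholesky(C)
        x = m + sigma * rng.standard_normal((lam, n)) @ L.T
        order = np.argsort([f(xi) for xi in x])[:mu]
        y = (x[order] - m) / sigma                # selected steps, best first
        m = m + sigma * (w @ y)                   # move mean toward good points
        C = (1 - c_mu) * C + c_mu * (y * w[:, None]).T @ y
        C = 0.5 * (C + C.T) + 1e-9 * np.eye(n)    # keep C symmetric positive
        sigma *= 0.995                            # naive step-size decay
    return m

# Minimize a shifted sphere function in 5 dimensions.
xbest = simplified_cmaes(lambda x: np.sum((x - 1.5) ** 2), np.zeros(5))
print(xbest)
```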
Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles
We present a canonical way to turn any smooth parametric family of
probability distributions on an arbitrary search space into a
continuous-time black-box optimization method on that search space, the
\emph{information-geometric optimization} (IGO) method. Invariance as a design
principle minimizes the number of arbitrary choices. The resulting \emph{IGO
flow} conducts the natural gradient ascent of an adaptive, time-dependent,
quantile-based transformation of the objective function. It makes no
assumptions on the objective function to be optimized.
The IGO method produces explicit IGO algorithms through time discretization.
It naturally recovers versions of known algorithms and offers a systematic way
to derive new ones. The cross-entropy method is recovered in a particular case,
and can be extended into a smoothed, parametrization-independent maximum
likelihood update (IGO-ML). For Gaussian distributions on continuous domains, IGO
is related to natural evolution strategies (NES) and recovers a version of the
CMA-ES algorithm. For Bernoulli distributions on bit strings, we recover the
PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm
for optimization on discrete search spaces. All these algorithms are unified under a
single information-geometric optimization framework.
Thanks to its intrinsic formulation, the IGO method achieves invariance under
reparametrization of the search space, under a change of parameters of the
probability distributions, and under increasing transformations of the
objective function.
Theory strongly suggests that IGO algorithms have minimal loss in diversity
during optimization, provided the initial diversity is high. First experiments
using restricted Boltzmann machines confirm this insight. Thus IGO seems to
provide, from information theory, an elegant way to spontaneously explore
several valleys of a fitness landscape in a single run. Comment: Final published version
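For intuition on how IGO instantiates as a concrete algorithm, the Bernoulli case mentioned in the abstract, which recovers PBIL, can be sketched as follows. The population size, learning rate, and elite fraction below are illustrative choices, not values derived in the paper.

```python
import numpy as np

def pbil_maximize(f, n_bits, pop=50, lr=0.1, iterations=100, seed=0):
    """PBIL sketch: keep a vector p of independent Bernoulli probabilities,
    sample a population of bit strings, and pull p toward the best samples.
    In the IGO picture this is a natural-gradient step for the Bernoulli family."""
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                           # maximum-entropy start
    for _ in range(iterations):
        X = (rng.random((pop, n_bits)) < p).astype(int)
        scores = np.array([f(x) for x in X])
        elite = X[np.argsort(scores)[-pop // 4:]]      # best quarter
        p = (1 - lr) * p + lr * elite.mean(axis=0)     # move toward elites
        p = np.clip(p, 0.02, 0.98)                     # keep some exploration
    return p

# Maximize OneMax: the number of ones in the bit string.
p = pbil_maximize(lambda x: x.sum(), n_bits=30)
print(np.round(p, 2))
```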
Two Procedures for Robust Monitoring of Probability Distributions of Economic Data Streams induced by Depth Functions
Data streams (streaming data) consist of transiently observed, multidimensional
data sequences that evolve in time and challenge our computational and/or
inferential capabilities. In this paper we propose user-friendly approaches for
robust monitoring of selected properties of the unconditional and conditional
distribution of the stream, based on depth functions. Our proposals are robust
to a small fraction of outliers and/or inliers but sensitive to a regime change
of the stream at the same time. Their implementations are available in our free
R package DepthProc. Comment: Operations Research and Decisions, vol. 25, No. 1, 201
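The monitoring idea can be illustrated with a simple (non-robust) Mahalanobis depth standing in for the depth functions implemented in DepthProc; the window size, threshold, and toy two-regime stream below are assumptions for illustration only.

```python
import numpy as np

def mahalanobis_depth(points, reference):
    """Depth of each point relative to a reference sample:
    D(x) = 1 / (1 + (x - mu)' S^{-1} (x - mu)). The robust depths offered by
    DepthProc (Tukey, projection, ...) could be substituted here."""
    mu = reference.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(reference, rowvar=False))
    diff = points - mu
    return 1.0 / (1.0 + np.einsum("ij,jk,ik->i", diff, S_inv, diff))

def monitor_stream(stream, window=100, threshold=0.2):
    """Raise an alarm when the median depth of an incoming window, relative to
    the trailing reference window, drops below a threshold (regime change)."""
    alarms = []
    for t in range(window, len(stream) - window + 1, window):
        ref, new = stream[t - window:t], stream[t:t + window]
        if np.median(mahalanobis_depth(new, ref)) < threshold:
            alarms.append(t)
    return alarms

rng = np.random.default_rng(0)
calm = rng.normal(0.0, 1.0, size=(400, 2))
shifted = rng.normal(4.0, 1.0, size=(200, 2))  # regime change at t = 400
alarms = monitor_stream(np.vstack([calm, shifted]))
print(alarms)
```

Small depth values flag points far from the bulk of the reference sample, so a sustained drop in the median depth signals a regime change while isolated outliers leave the median unaffected.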
Efficient Sequential Monte-Carlo Samplers for Bayesian Inference
In many problems, complex non-Gaussian and/or nonlinear models are required
to accurately describe a physical system of interest. In such cases, Monte
Carlo algorithms are remarkably flexible and extremely powerful approaches to
solve such inference problems. However, in the presence of a high-dimensional
and/or multimodal posterior distribution, it is widely documented that standard
Monte Carlo techniques can lead to poor performance. In this paper, we focus on
the Sequential Monte Carlo (SMC) sampler framework, a more robust and efficient
Monte Carlo algorithm. Although this approach has many advantages over
traditional Monte Carlo methods, its potential is still largely underexploited
in signal processing. In this work,
we propose novel strategies that improve the efficiency and facilitate the
practical implementation of the SMC sampler for signal processing
applications. Firstly, we propose an automatic and adaptive strategy
that selects the sequence of distributions within the SMC sampler that
minimizes the asymptotic variance of the estimator of the posterior
normalization constant. This is critical for performing model selection in
modelling applications in Bayesian signal processing. The second original
contribution we present improves the global efficiency of the SMC sampler by
introducing a novel correction mechanism that allows the use of the particles
generated through all the iterations of the algorithm (instead of only
particles from the last iteration). This is a significant contribution as it
removes the need to discard a large portion of the samples obtained, as is
standard in SMC methods. This will improve estimation performance in practical
settings where the computational budget is important to consider. Comment: arXiv admin note: text overlap with arXiv:1303.3123 by other authors
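The first contribution, adaptively choosing the sequence of distributions, is commonly realized by tempering the likelihood and bisecting for each next inverse temperature so that the effective sample size of the incremental weights hits a target. The sketch below illustrates that idea on a toy Gaussian-mean posterior; it is not the authors' exact criterion (they minimize the asymptotic variance of the normalizing-constant estimator), and the model, move kernel, and ESS target are all assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy target: posterior of a Gaussian mean with known unit variance (assumption).
data = rng.normal(2.0, 1.0, size=50)
log_lik = lambda th: norm.logpdf(data[None, :], th[:, None], 1.0).sum(axis=1)
log_prior = lambda th: norm.logpdf(th, 0.0, 5.0)

def next_beta(log_l, w, beta, ess_target):
    """Bisect for the next inverse temperature so that the effective sample
    size of the incremental importance weights hits ess_target."""
    def ess(b):
        lw = np.log(w) + (b - beta) * log_l
        v = np.exp(lw - lw.max())
        return v.sum() ** 2 / (v ** 2).sum()
    if ess(1.0) >= ess_target:
        return 1.0
    lo, hi = beta, 1.0                         # ess(lo) >= target > ess(hi)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ess(mid) >= ess_target else (lo, mid)
    return lo

n = 500
theta = rng.normal(0.0, 5.0, size=n)           # particles drawn from the prior
w = np.full(n, 1.0 / n)
beta = 0.0
while beta < 1.0:
    ll = log_lik(theta)
    new_beta = next_beta(ll, w, beta, ess_target=n / 2)
    lw = np.log(w) + (new_beta - beta) * ll    # incremental reweighting
    w = np.exp(lw - lw.max())
    w /= w.sum()
    theta = theta[rng.choice(n, size=n, p=w)]  # multinomial resampling
    w, beta = np.full(n, 1.0 / n), new_beta
    # One Metropolis move per particle, targeting the tempered posterior.
    prop = theta + rng.normal(0.0, 0.2, size=n)
    log_pi = lambda th: log_prior(th) + beta * log_lik(th)
    accept = np.log(rng.random(n)) < log_pi(prop) - log_pi(theta)
    theta = np.where(accept, prop, theta)

print(beta, theta.mean())
```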
Bayesian computation via empirical likelihood
Approximate Bayesian computation (ABC) has become an essential tool for the
analysis of complex stochastic models when the likelihood function is
numerically unavailable. However, the well-established statistical method of
empirical likelihood provides another route to such settings that bypasses
simulations from the model and the choices of the ABC parameters (summary
statistics, distance, tolerance), while being convergent in the number of
observations. Furthermore, bypassing model simulations may lead to significant
time savings in complex models, for instance those found in population
genetics. The BCel algorithm we develop in this paper also provides an
evaluation of its own performance through an associated effective sample size.
The method is illustrated using several examples, including estimation of
standard distributions, time series, and population genetics models. Comment: 21 pages, 12 figures, revised version of the previous version with a new title
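The empirical-likelihood route can be illustrated in its simplest case, inference on a mean: weight prior draws by the empirical likelihood implied by the moment constraint E[X] = theta. This is a toy sketch of the idea, not the paper's BCel algorithm; the flat prior, data, and sample sizes are illustrative assumptions.

```python
import numpy as np

def log_el_mean(x, theta):
    """Log empirical likelihood ratio for a mean: maximize sum_i log(n w_i)
    subject to sum_i w_i (x_i - theta) = 0, sum_i w_i = 1, w_i > 0. The optimum
    is w_i = 1 / (n (1 + lam (x_i - theta))), where lam solves
    g(lam) = sum_i (x_i - theta) / (1 + lam (x_i - theta)) = 0 (g decreasing)."""
    z = x - theta
    if z.min() >= 0 or z.max() <= 0:
        return -np.inf                        # theta outside the convex hull
    lo = -1.0 / z.max() + 1e-9                # bounds keeping all 1 + lam*z > 0
    hi = -1.0 / z.min() - 1e-9
    g = lambda lam: np.sum(z / (1.0 + lam * z))
    for _ in range(60):                       # plain bisection for the root
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return -np.sum(np.log1p(0.5 * (lo + hi) * z))

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=100)            # observed data, mean unknown
thetas = rng.uniform(-3.0, 5.0, size=2000)    # draws from a flat prior
logw = np.array([log_el_mean(x, t) for t in thetas])
w = np.exp(logw - logw.max())
w /= w.sum()
post_mean = np.sum(w * thetas)
ess = 1.0 / np.sum(w ** 2)                    # effective sample size diagnostic
print(post_mean, ess)
```

As in the abstract, the effective sample size of the importance weights doubles as a built-in diagnostic of how well the empirical-likelihood posterior is being approximated.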