Semiparametric Cross Entropy for rare-event simulation
The Cross Entropy method is a well-known adaptive importance sampling method
for rare-event probability estimation, which requires estimating an optimal
importance sampling density within a parametric class. In this article we
estimate an optimal importance sampling density within a wider semiparametric
class of distributions. We show that this semiparametric version of the Cross
Entropy method frequently yields efficient estimators. We illustrate the
excellent practical performance of the method with numerical experiments and
show that, for the problems we consider, it typically outperforms alternative
schemes by orders of magnitude.
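As a baseline for what the abstract above generalizes, the standard parametric Cross Entropy method can be sketched in a few lines. This is a minimal illustration under assumed simplifications (a one-dimensional Gaussian family with unit variance and a toy Gaussian tail event), not the authors' semiparametric estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_entropy_rare_event(gamma=4.0, n=10_000, rho=0.1, iters=20):
    """Estimate P(X > gamma) for X ~ N(0, 1) with the parametric
    Cross Entropy method over the Gaussian family N(mu, 1)."""
    mu = 0.0  # start from the nominal density
    for _ in range(iters):
        x = rng.normal(mu, 1.0, n)
        # adaptive level: the (1 - rho) quantile, capped at gamma
        level = min(np.quantile(x, 1 - rho), gamma)
        elite = x[x >= level]
        # likelihood ratio of nominal N(0, 1) against current N(mu, 1)
        w = np.exp(-0.5 * elite**2 + 0.5 * (elite - mu) ** 2)
        mu = np.sum(w * elite) / np.sum(w)  # CE update: weighted elite mean
        if level >= gamma:
            break
    # final importance sampling estimate under the fitted N(mu, 1)
    x = rng.normal(mu, 1.0, n)
    w = np.exp(-0.5 * x**2 + 0.5 * (x - mu) ** 2)
    return np.mean(w * (x >= gamma))
```

For gamma = 4 the true probability is about 3.17e-5; crude Monte Carlo with the same budget would rarely see the event at all.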
Computationally Efficient Nonparametric Importance Sampling
The variance reduction established by importance sampling strongly depends on
the choice of the importance sampling distribution. A good choice is often hard
to achieve especially for high-dimensional integration problems. Nonparametric
estimation of the optimal importance sampling distribution (known as
nonparametric importance sampling) is a reasonable alternative to parametric
approaches. In this article, nonparametric variants of both the self-normalized
and the unnormalized importance sampling estimator are proposed and
investigated. A common critique of nonparametric importance sampling is the
increased computational burden compared to parametric methods. We largely
overcome this problem by using the linear blend frequency polygon
estimator instead of a kernel estimator. Mean square error convergence
properties are investigated leading to recommendations for the efficient
application of nonparametric importance sampling. Particularly, we show that
nonparametric importance sampling asymptotically attains optimal importance
sampling variance. The efficiency of nonparametric importance sampling
algorithms heavily relies on the computational efficiency of the employed
nonparametric estimator. The linear blend frequency polygon outperforms kernel
estimators in terms of certain criteria such as efficient sampling and
evaluation. Furthermore, it is compatible with the inversion method for sample
generation. This allows us to combine our algorithms with other variance reduction
techniques such as stratified sampling. Empirical evidence for the usefulness
of the suggested algorithms is obtained by means of three benchmark integration
problems. As an application we estimate the distribution of the queue length of
a spam filter queueing system based on real data. Comment: 29 pages, 7 figures.
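The two estimator variants discussed above can be written generically. A minimal sketch, with a fixed Gaussian proposal standing in for the nonparametric frequency-polygon density of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def is_estimates(h, target_pdf, x, proposal_pdf):
    """Unnormalized and self-normalized importance sampling estimates of
    E_f[h(X)] from draws x of the proposal.  The self-normalized variant
    also works when target_pdf is known only up to a constant."""
    w = target_pdf(x) / proposal_pdf(x)
    unnorm = np.mean(w * h(x))                # unnormalized estimator
    self_norm = np.sum(w * h(x)) / np.sum(w)  # self-normalized estimator
    return unnorm, self_norm

# toy check: E[X^2] = 1 under f = N(0, 1), proposal g = N(0, 2^2)
f = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
g = lambda x: np.exp(-x**2 / 8.0) / np.sqrt(8 * np.pi)
x = rng.normal(0.0, 2.0, 100_000)
u, s = is_estimates(lambda t: t**2, f, x, g)
```

The self-normalized version trades a small bias for the freedom to drop normalizing constants, which is what makes it attractive with estimated densities.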
Importance Sampling and its Optimality for Stochastic Simulation Models
We consider the problem of estimating an expected outcome from a stochastic
simulation model. Our goal is to develop a theoretical framework on importance
sampling for such estimation. By investigating the variance of an importance
sampling estimator, we propose a two-stage procedure that involves a regression
stage and a sampling stage to construct the final estimator. We introduce a
parametric and a nonparametric regression estimator in the first stage and
study how the allocation between the two stages affects the performance of the
final estimator. We analyze the variance reduction rates and derive oracle
properties of both methods. We evaluate the empirical performances of the
methods using two numerical examples and a case study on wind turbine
reliability evaluation. Comment: 37 pages, 6 figures, 2 tables. Accepted to
the Electronic Journal of Statistics.
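For orientation, the variance object that drives such an analysis is classical. With input density $f$, sampling density $q$, and a deterministic integrand $h$, the importance sampling estimator and its variance are

```latex
\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} h(X_i)\,\frac{f(X_i)}{q(X_i)},
\quad X_i \stackrel{\text{iid}}{\sim} q,
\qquad
\operatorname{Var}(\hat{\mu}_n)
  = \frac{1}{n}\left(\int \frac{h(x)^2 f(x)^2}{q(x)}\,dx - \mu^2\right),
```

and for $h \ge 0$ the minimizer is the zero-variance density $q^*(x) \propto h(x)\,f(x)$. For a stochastic simulation output $Y(x)$, the $h(x)^2$ term becomes the conditional second moment $\mathbb{E}[Y^2 \mid x]$, which is the kind of quantity a regression stage can estimate.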
Computational aspects of Bayesian spectral density estimation
Gaussian time-series models are often specified through their spectral
density. Such models present several computational challenges, in particular
because of the non-sparse nature of the covariance matrix. We derive a fast
approximation of the likelihood for such models. We propose to sample from the
approximate posterior (that is, the prior times the approximate likelihood),
and then to recover the exact posterior through importance sampling. We show
that the variance of the importance sampling weights vanishes as the sample
size goes to infinity. We explain why the approximate posterior may typically
be multimodal, and we derive a Sequential Monte Carlo sampler based on an
annealing sequence in order to sample from that target distribution.
Performance of the overall approach is evaluated on simulated and real
datasets. In addition, for one real-world dataset, we provide some numerical
evidence that a Bayesian approach to semi-parametric estimation of the spectral
density may provide more reasonable results than its frequentist counterparts.
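The recover-the-exact-posterior step described above is generic and worth spelling out. A minimal sketch, with the spectral-density likelihoods replaced by arbitrary log-density callables and a toy Gaussian stand-in:

```python
import numpy as np

rng = np.random.default_rng(3)

def exact_posterior_correction(theta, log_exact, log_approx):
    """Self-normalized importance weights that turn draws from an
    approximate posterior into estimates under the exact posterior."""
    lw = log_exact(theta) - log_approx(theta)
    lw -= lw.max()                     # guard against overflow
    w = np.exp(lw)
    w /= w.sum()
    ess = 1.0 / np.sum(w**2)           # effective sample size diagnostic
    return w, ess

# toy stand-in: approximate posterior N(0, 1.1^2), exact posterior N(0, 1)
theta = rng.normal(0.0, 1.1, 5000)
w, ess = exact_posterior_correction(
    theta,
    log_exact=lambda t: -0.5 * t**2,
    log_approx=lambda t: -0.5 * (t / 1.1) ** 2,
)
posterior_mean = np.sum(w * theta)     # estimate under the exact posterior
```

The effective sample size is the standard diagnostic here: the paper's result that the weight variance vanishes asymptotically means the ESS stays close to the number of draws.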
Estimation and prediction for spatial generalized linear mixed models with parametric links via reparameterized importance sampling
Spatial generalized linear mixed models (SGLMMs) are popular for analyzing
non-Gaussian spatial data. These models assume a prescribed link function that
relates the underlying spatial field with the mean response. There are
circumstances, such as when the data contain outlying observations, where the
use of a prescribed link function can result in poor fit, which can be improved
by using a parametric link function. Some popular link functions, such as the
Box-Cox, are unsuitable because they are inconsistent with the Gaussian
assumption of the spatial field. We present sensible choices of parametric link
functions which possess desirable properties. It is important to estimate the
parameters of the link function, rather than assume a known value. To that end,
we present a generalized importance sampling (GIS) estimator based on multiple
Markov chains for empirical Bayes analysis of SGLMMs. The GIS estimator,
although more efficient than simple importance sampling, can be highly
variable when used to estimate the parameters of certain link functions. Using
suitable reparameterizations of the Monte Carlo samples, we propose modified
GIS estimators that do not suffer from high variability. We use Laplace
approximation for choosing the multiple importance densities in the GIS
estimator. Finally, we develop a methodology for selecting models with
appropriate link function family, which extends to choosing a spatial
correlation function as well. We present an ensemble prediction of the mean
response by appropriately weighting the estimates from different models. The
proposed methodology is illustrated using simulated and real data examples.
The transform likelihood ratio method for rare event simulation with heavy tails
We present a novel method, called the transform likelihood ratio (TLR) method, for estimating rare-event probabilities with heavy-tailed distributions. Via a simple transformation (change of variables), the TLR method reduces the original rare-event probability estimation problem with heavy-tailed distributions to an equivalent one with light-tailed distributions. Once this transformation has been established, we estimate the rare-event probability via importance sampling, using either the classical exponential change of measure or the standard likelihood ratio change of measure. In the latter case the importance sampling distribution is chosen from the same parametric family as the transformed distribution. We estimate the optimal parameter vector of the importance sampling distribution using the cross-entropy method. We prove the polynomial complexity of the TLR method for certain heavy-tailed models and demonstrate numerically its high efficiency for various heavy-tailed models previously thought to be intractable. We also show that the TLR method can be viewed as a universal tool in the sense that it not only provides a unified view of heavy-tailed simulation but can also be used efficiently in simulation with light-tailed distributions. We present extensive simulation results which support the efficiency of the TLR method.
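The transformation idea can be illustrated on the simplest heavy-tailed example. A hedged sketch, assuming a pure Pareto tail and a hand-picked tilting rate rather than the cross-entropy-optimized parameters of the paper:

```python
import numpy as np

rng = np.random.default_rng(4)

def tlr_pareto_tail(alpha=1.0, gamma=1e6, n=100_000):
    """Estimate P(X > gamma) for a Pareto tail P(X > x) = x^(-alpha),
    x >= 1.  Writing X = exp(Y / alpha) with Y ~ Exp(1) turns the
    heavy-tailed event {X > gamma} into the light-tailed event
    {Y > alpha * log(gamma)}, which exponential tilting handles."""
    level = alpha * np.log(gamma)
    lam = 1.0 / level                      # tilted rate (ad hoc choice)
    y = rng.exponential(1.0 / lam, n)      # draws from Exp(lam)
    w = np.exp(-(1.0 - lam) * y) / lam     # likelihood ratio Exp(1)/Exp(lam)
    return np.mean(w * (y > level))
```

For alpha = 1 and gamma = 1e6 the exact answer is gamma**(-alpha) = 1e-6; crude Monte Carlo would need on the order of 1e8 samples just to observe the event.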
Variational autoencoder with weighted samples for high-dimensional non-parametric adaptive importance sampling
Probability density function estimation with weighted samples is the main
foundation of all adaptive importance sampling algorithms. Classically, a
target distribution is approximated either by a non-parametric model or within
a parametric family. However, these models suffer from the curse of
dimensionality or from a lack of flexibility. In this contribution, we suggest
using a distribution parameterised by a variational autoencoder as the
approximating model. We extend the existing framework to the case of
weighted samples by introducing a new objective function. The flexibility of
the obtained family of distributions makes it as expressive as a non-parametric
model, and despite the very high number of parameters to estimate, this family
is much more efficient in high dimension than the classical Gaussian or
Gaussian mixture families. Moreover, in order to add flexibility to the model
and to be able to learn multimodal distributions, we consider a learnable prior
distribution for the variational autoencoder latent variables. We also
introduce a new pre-training procedure for the variational autoencoder that
finds good starting weights for the neural networks and prevents, as far as
possible, the posterior collapse phenomenon. Finally, we make explicit how the resulting
distribution can be combined with importance sampling, and we exploit the
proposed procedure in existing adaptive importance sampling algorithms to draw
points from a target distribution and to estimate a rare event probability in
high dimension on two multimodal problems. Comment: 20 pages, 5 figures.
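The weighted-sample fitting problem underlying all such adaptive schemes can be made concrete with the simplest parametric family. A minimal sketch: a weighted maximum-likelihood Gaussian fit, which is exactly the kind of inflexible model the paper's VAE family is meant to replace:

```python
import numpy as np

rng = np.random.default_rng(5)

def weighted_gaussian_fit(x, w):
    """Weighted maximum-likelihood fit of a 1-D Gaussian to importance-
    weighted samples (the moment-matching step of classical adaptive IS)."""
    w = w / np.sum(w)
    mu = np.sum(w * x)
    var = np.sum(w * (x - mu) ** 2)
    return mu, var

# samples from a proposal N(0, 2^2), weighted toward a target N(3, 1)
x = rng.normal(0.0, 2.0, 200_000)
log_w = -0.5 * (x - 3.0) ** 2 + x**2 / 8.0   # log target - log proposal
w = np.exp(log_w - log_w.max())              # stabilized weights
mu, var = weighted_gaussian_fit(x, w)
```

The fitted (mu, var) recovers the target's moments; the contribution of the paper is to replace this two-parameter family with a VAE trained on the same weighted samples.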