Selection of proposal distributions for generalized importance sampling estimators
The standard importance sampling (IS) estimator generally does not work well
in examples involving simultaneous inference on several targets, as the
importance weights can take arbitrarily large values, making the estimator
highly unstable. In such situations, alternative generalized IS estimators
involving samples from multiple proposal distributions are preferred. Just like
the standard IS, the success of these multiple IS estimators crucially depends
on the choice of the proposal distributions. The selection of these proposal
distributions is the focus of this article. We propose three methods based on
(i) a geometric space filling coverage criterion, (ii) a minimax variance
approach, and (iii) a maximum entropy approach. The first two methods are
applicable to any multi-proposal IS estimator, whereas the third approach is
described in the context of Doss's (2010) two-stage IS estimator. For the first
method we propose a suitable measure of coverage based on the symmetric
Kullback-Leibler divergence, while the second and third approaches use
estimates of asymptotic variances of Doss's (2010) IS estimator and Geyer's
(1994) reverse logistic estimator, respectively. To this end, we provide
consistent spectral variance estimators for these asymptotic variances. The
proposed methods for selecting proposal densities are illustrated using several
detailed examples.
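The specific estimators and selection criteria are developed in the article itself; the sketch below is only a minimal, hypothetical Python illustration of the basic multi-proposal IS mechanics, where pooled draws are weighted by the mixture of the proposal densities (a balance-heuristic scheme). It assumes fully normalized proposal densities and is not Doss's (2010) two-stage estimator.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def multi_proposal_is(target_logpdf, proposals, n_per_proposal, h):
    """Self-normalised IS estimate of E_target[h(X)] using draws from several
    proposals, weighted by the mixture density sum_j (n_j / N) q_j(x), which
    keeps weights bounded when at least one proposal covers each region of the
    target."""
    n_total = sum(n_per_proposal)
    xs = np.concatenate([q.rvs(size=n, random_state=rng)
                         for q, n in zip(proposals, n_per_proposal)])
    # log mixture proposal density evaluated at every pooled draw
    log_q = np.stack([np.log(n / n_total) + q.logpdf(xs)
                      for q, n in zip(proposals, n_per_proposal)])
    log_w = target_logpdf(xs) - np.logaddexp.reduce(log_q, axis=0)
    w = np.exp(log_w - log_w.max())
    return np.sum(w * h(xs)) / np.sum(w)

# Example: the mean of a t_3 target estimated from two normal proposals.
est = multi_proposal_is(stats.t(df=3).logpdf,
                        [stats.norm(-2, 2), stats.norm(2, 2)],
                        [5000, 5000], h=lambda x: x)
print(est)  # should be close to 0
```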
MCMC for GLMMs
Generalized linear mixed models (GLMMs) are often used for analyzing
correlated non-Gaussian data. The likelihood function in a GLMM is available
only as a high-dimensional integral, and thus closed-form inference and
prediction are not possible. Since the likelihood is not available in
closed form, the associated posterior densities in Bayesian GLMMs are also
intractable. Generally, Markov chain Monte Carlo (MCMC) algorithms are used for
conditional simulation in GLMMs and exploring these posterior densities. In
this article, we present several state-of-the-art MCMC algorithms for fitting
GLMMs. These MCMC algorithms include efficient data augmentation strategies, as
well as diffusion-based and Hamiltonian dynamics-based methods. The Langevin
and Hamiltonian Monte Carlo methods presented here are applicable to any GLMM,
and are illustrated using three of the most popular GLMMs, namely, the logistic
and probit GLMMs for binomial data and the Poisson-log GLMM for count data. We
also present efficient data augmentation algorithms for probit and logistic GLMMs.
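As a rough illustration of a Langevin-type sampler for a GLMM posterior, the sketch below runs MALA on a simplified random-intercept logistic GLMM with the variance component held fixed. The model, data, and tuning are hypothetical, and the sketch is not the specific algorithms developed in the article.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data (hypothetical): y_ij ~ Bernoulli(sigmoid(beta + u_i)),
# u_i ~ N(0, sigma2) with sigma2 treated as known for simplicity.
m, n_per, sigma2 = 20, 10, 1.0
u_true = rng.normal(0, np.sqrt(sigma2), m)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + u_true)[:, None])), (m, n_per))

def log_post(theta):
    beta, u = theta[0], theta[1:]
    eta = beta + u[:, None]
    return np.sum(y * eta - np.logaddexp(0.0, eta)) - np.sum(u**2) / (2 * sigma2)

def grad_log_post(theta):
    beta, u = theta[0], theta[1:]
    resid = y - 1 / (1 + np.exp(-(beta + u[:, None])))
    return np.concatenate(([resid.sum()], resid.sum(axis=1) - u / sigma2))

def mala(theta, eps, n_iter):
    """Metropolis-adjusted Langevin algorithm targeting the GLMM posterior."""
    draws = np.empty((n_iter, theta.size))
    lp, g = log_post(theta), grad_log_post(theta)
    for t in range(n_iter):
        prop = theta + 0.5 * eps**2 * g + eps * rng.normal(size=theta.size)
        lp_p, g_p = log_post(prop), grad_log_post(prop)
        # log proposal densities q(prop | theta) and q(theta | prop)
        fwd = -np.sum((prop - theta - 0.5 * eps**2 * g)**2) / (2 * eps**2)
        bwd = -np.sum((theta - prop - 0.5 * eps**2 * g_p)**2) / (2 * eps**2)
        if np.log(rng.uniform()) < lp_p - lp + bwd - fwd:
            theta, lp, g = prop, lp_p, g_p
        draws[t] = theta
    return draws

draws = mala(np.zeros(m + 1), eps=0.1, n_iter=5000)
print(draws[:, 0].mean())  # posterior mean of the intercept beta
```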
Improving the Convergence Properties of the Data Augmentation Algorithm with an Application to Bayesian Mixture Modeling
The reversible Markov chains that drive the data augmentation (DA) and
sandwich algorithms define self-adjoint operators whose spectra encode the
convergence properties of the algorithms. When the target distribution has
uncountable support, as is nearly always the case in practice, it is generally
quite difficult to get a handle on these spectra. We show that, if the
augmentation space is finite, then (under regularity conditions) the operators
defined by the DA and sandwich chains are compact, and the spectra are finite
subsets of [0,1). Moreover, we prove that the spectrum of the sandwich
operator dominates the spectrum of the DA operator in the sense that the
ordered elements of the former are all less than or equal to the corresponding
elements of the latter. As a concrete example, we study a widely used DA
algorithm for the exploration of posterior densities associated with Bayesian
mixture models [J. Roy. Statist. Soc. Ser. B 56 (1994) 363--375]. In
particular, we compare this mixture DA algorithm with an alternative algorithm
proposed by Fr\"{u}hwirth-Schnatter [J. Amer. Statist. Assoc. 96 (2001)
194--209] that is based on random label switching.Comment: Published in at http://dx.doi.org/10.1214/11-STS365 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
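Although the article's results are operator-theoretic, the finite-augmentation-space claim can be checked numerically on a toy example. The sketch below builds the DA and sandwich transition matrices for a hypothetical discrete joint distribution (not the Bayesian mixture model studied in the paper), using a block-conditional redraw as the extra sandwich step (reversible and idempotent with respect to the marginal of the augmented variable), and verifies that the ordered sandwich eigenvalues are dominated by the DA eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy joint distribution p(x, y) on finite spaces X = {0,...,4}, Y = {0,...,3}.
p_xy = rng.dirichlet(np.ones(20)).reshape(5, 4)
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
p_y_given_x = p_xy / p_x[:, None]        # row x: p(y | x)
p_x_given_y = (p_xy / p_y[None, :]).T    # row y: p(x | y)

# DA chain on X:  x -> y ~ p(y|x) -> x' ~ p(x'|y)
K_da = p_y_given_x @ p_x_given_y

# Sandwich chain inserts an extra step y -> y' from a kernel R reversible
# w.r.t. p_y; here R redraws y from p_y conditioned on a fixed block of Y.
blocks = [[0, 1], [2, 3]]
R = np.zeros((4, 4))
for b in blocks:
    R[np.ix_(b, b)] = p_y[b] / p_y[b].sum()
K_sw = p_y_given_x @ R @ p_x_given_y

def spectrum(K, pi):
    """Eigenvalues of the self-adjoint Markov operator on L^2_0(pi)."""
    d = np.sqrt(pi)
    S = K * d[:, None] / d[None, :]      # symmetric matrix similar to K
    ev = np.sort(np.linalg.eigvalsh(S))[::-1]
    return ev[1:]                        # drop the trivial eigenvalue 1

ev_da, ev_sw = spectrum(K_da, p_x), spectrum(K_sw, p_x)
print(np.round(ev_da, 4))
print(np.round(ev_sw, 4))
print(np.all(ev_sw <= ev_da + 1e-12))    # ordered domination holds
```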
Estimation and prediction for spatial generalized linear mixed models with parametric links via reparameterized importance sampling
Spatial generalized linear mixed models (SGLMMs) are popular for analyzing non-Gaussian spatial data. These models assume a prescribed link function that relates the underlying spatial field to the mean response. There are circumstances, such as when the data contain outlying observations, where the use of a prescribed link function can result in poor fit, which can be improved by using a parametric link function. Some popular link functions, such as the Box-Cox, are unsuitable because they are inconsistent with the Gaussian assumption on the spatial field. We present sensible choices of parametric link functions that possess desirable properties. It is important to estimate the parameters of the link function rather than assume a known value. To that end, we present a generalized importance sampling (GIS) estimator based on multiple Markov chains for empirical Bayes analysis of SGLMMs. The GIS estimator, although more efficient than simple importance sampling, can be highly variable when used to estimate the parameters of certain link functions. Using suitable reparameterizations of the Monte Carlo samples, we propose modified GIS estimators that do not suffer from this high variability. We use a Laplace approximation to choose the multiple importance densities in the GIS estimator. Finally, we develop a methodology for selecting the model with the appropriate link function family, which extends to choosing a spatial correlation function as well. We present an ensemble prediction of the mean response by appropriately weighting the estimates from different models. The proposed methodology is illustrated using simulated and real data examples.
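The abstract does not spell out which parametric link families are used. As one hypothetical illustration of a link family that, unlike Box-Cox, maps the whole real line to (0, 1) and is therefore compatible with a Gaussian latent field, the sketch below evaluates a robit-type link (the Student-t CDF with degrees of freedom nu), whose parameter controls tail heaviness and hence robustness to outlying observations.

```python
import numpy as np
from scipy import stats

# Robit-type parametric inverse link: g_nu^{-1}(eta) = T_nu(eta), where T_nu
# is the Student-t CDF.  Small nu gives heavier tails; large nu approaches
# the probit link.  eta would be the latent Gaussian field plus fixed effects.
def robit_inverse_link(eta, nu):
    return stats.t.cdf(eta, df=nu)

eta = np.linspace(-4, 4, 9)
for nu in (1, 4, 30):
    print(nu, np.round(robit_inverse_link(eta, nu), 3))
print("probit", np.round(stats.norm.cdf(eta), 3))
```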