207 research outputs found
Metropolis-Hastings within Partially Collapsed Gibbs Samplers
The Partially Collapsed Gibbs (PCG) sampler offers a new strategy for
improving the convergence of a Gibbs sampler. PCG achieves faster convergence
by reducing the conditioning in some of the draws of its parent Gibbs sampler.
Although this can significantly improve convergence, care must be taken to
ensure that the stationary distribution is preserved. The conditional
distributions sampled in a PCG sampler may be incompatible and permuting their
order may upset the stationary distribution of the chain. Extra care must be
taken when Metropolis-Hastings (MH) updates are used in some or all of the
updates. Reducing the conditioning in an MH within Gibbs sampler can change the
stationary distribution, even when the PCG sampler would work perfectly if MH
were not used. In fact, a number of samplers of this sort that have been
advocated in the literature do not actually have the target stationary
distributions. In this article, we illustrate the challenges that may arise
when using MH within a PCG sampler and develop a general strategy for using
such updates while maintaining the desired stationary distribution. Theoretical
arguments provide guidance when choosing between different MH within PCG
sampling schemes. Finally, we illustrate the MH within PCG sampler and its
computational advantage using several examples from our applied work.
MNP: R Package for Fitting the Multinomial Probit Model
MNP is a publicly available R package that fits the Bayesian multinomial probit model via Markov chain Monte Carlo. The multinomial probit model is often used to analyze the discrete choices made by individuals recorded in survey data. Examples where the multinomial probit model may be useful include the analysis of product choice by consumers in market research and the analysis of candidate or party choice by voters in electoral studies. The MNP software can also fit the model with different choice sets for each individual, and complete or partial individual choice orderings of the available alternatives from the choice set. The estimation is based on the efficient marginal data augmentation algorithm that is developed by Imai and van Dyk (2005).
Cross-Fertilizing Strategies for Better EM Mountain Climbing and DA Field Exploration: A Graphical Guide Book
In recent years, a variety of extensions and refinements have been developed
for data augmentation based model fitting routines. These developments aim to
extend the application, improve the speed and/or simplify the implementation of
data augmentation methods, such as the deterministic EM algorithm for mode
finding and stochastic Gibbs sampler and other auxiliary-variable based methods
for posterior sampling. In this overview article we graphically illustrate and
compare a number of these extensions, all of which aim to maintain the
simplicity and computational stability of their predecessors. We particularly
emphasize the usefulness of identifying similarities between the deterministic
and stochastic counterparts as we seek more efficient computational strategies.
We also demonstrate the applicability of data augmentation methods for handling
complex models with highly hierarchical structure, using a high-energy
high-resolution spectral imaging model for data from satellite telescopes, such
as the Chandra X-ray Observatory.
Comment: Published at http://dx.doi.org/10.1214/09-STS309 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
A method for comparing non-nested models with application to astrophysical searches for new physics
Searches for unknown physics and decisions between competing astrophysical
models to explain data both rely on statistical hypothesis testing. The usual
approach in searches for new physical phenomena is based on the statistical
Likelihood Ratio Test (LRT) and its asymptotic properties. In the common
situation, when neither of the two models under comparison is a special case of
the other, i.e., when the hypotheses are non-nested, this test is not
applicable. In astrophysics, this problem occurs when two models that reside in
different parameter spaces are to be compared. An important example is the
recently reported excess emission in astrophysical γ-rays and the
question whether its origin is known astrophysics or dark matter. We develop
and study a new, simple, generally applicable, frequentist method and validate
its statistical properties using a suite of simulation studies. We exemplify
it on realistic simulated data of the Fermi-LAT γ-ray satellite, where
non-nested hypothesis testing appears in the search for particle dark matter.
Comment: We welcome examples of non-nested models testing problem
A Repelling-Attracting Metropolis Algorithm for Multimodality
Although the Metropolis algorithm is simple to implement, it often has
difficulties exploring multimodal distributions. We propose the
repelling-attracting Metropolis (RAM) algorithm that maintains the
simple-to-implement nature of the Metropolis algorithm, but is more likely to
jump between modes. The RAM algorithm is a Metropolis-Hastings algorithm with a
proposal that consists of a downhill move in density that aims to make local
modes repelling, followed by an uphill move in density that aims to make local
modes attracting. The downhill move is achieved via a reciprocal Metropolis
ratio so that the algorithm prefers downward movement. The uphill move does the
opposite using the standard Metropolis ratio which prefers upward movement.
This down-up movement in density increases the probability of a proposed move
to a different mode. Because the acceptance probability of the proposal
involves a ratio of intractable integrals, we introduce an auxiliary variable
which creates a term in the acceptance probability that cancels with the
intractable ratio. Using several examples, we demonstrate the potential for the
RAM algorithm to explore a multimodal distribution more efficiently than a
Metropolis algorithm and with less tuning than is commonly required by
tempering-based methods
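
The down-up proposal described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian random-walk proposal, the constant `EPS`, the choice to start the auxiliary variable at `x0`, the bimodal test density, and all function names are hypothetical, and the final acceptance ratio is one way to realize the cancellation via an auxiliary variable that the abstract mentions.

```python
import math
import random

EPS = 1e-308  # tiny constant that keeps the ratios finite (an assumption)


def forced_move(density, x, scale, rng, downhill):
    """Repeat a Gaussian proposal centred at x until one is accepted.

    downhill=True uses the reciprocal Metropolis ratio (prefers lower density);
    downhill=False uses the standard Metropolis ratio (prefers higher density).
    """
    px = density(x)
    while True:
        y = rng.gauss(x, scale)
        py = density(y)
        if downhill:
            ratio = (px + EPS) / (py + EPS)
        else:
            ratio = (py + EPS) / (px + EPS)
        if rng.random() < min(1.0, ratio):
            return y


def ram_step(density, x, z, scale, rng):
    """One down-up update of the pair (x, z); z is the auxiliary variable."""
    x_down = forced_move(density, x, scale, rng, downhill=True)        # repel
    x_star = forced_move(density, x_down, scale, rng, downhill=False)  # attract
    z_star = forced_move(density, x_star, scale, rng, downhill=True)   # auxiliary
    px, pz = density(x), density(z)
    ps, pzs = density(x_star), density(z_star)
    num = ps * min(1.0, (px + EPS) / (pz + EPS))
    den = px * min(1.0, (ps + EPS) / (pzs + EPS))
    # Accept the joint proposal (x_star, z_star) or keep the current pair.
    if den == 0.0 or rng.random() < min(1.0, num / den):
        return x_star, z_star
    return x, z


def ram_sample(density, x0, n_iter, scale, seed=0):
    """Run the sketch for n_iter iterations and return the draws of x."""
    rng = random.Random(seed)
    x, z = x0, x0  # starting the auxiliary variable at x0 is an assumption
    draws = []
    for _ in range(n_iter):
        x, z = ram_step(density, x, z, scale, rng)
        draws.append(x)
    return draws


def bimodal(x):
    """Hypothetical target: equal mixture of N(-5, 1) and N(5, 1)."""
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return 0.5 * c * (math.exp(-0.5 * (x + 5.0) ** 2) + math.exp(-0.5 * (x - 5.0) ** 2))


draws = ram_sample(bimodal, x0=-5.0, n_iter=2000, scale=4.0)
```

A simple check on such a run is to compare the fraction of draws on each side of zero; for this symmetric target both fractions should approach one half as the chain lengthens.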
Preprocessing Solar Images while Preserving their Latent Structure
Telescopes such as the Atmospheric Imaging Assembly aboard the Solar Dynamics
Observatory, a NASA satellite, collect massive streams of high resolution
images of the Sun through multiple wavelength filters. Reconstructing
pixel-by-pixel thermal properties based on these images can be framed as an
ill-posed inverse problem with Poisson noise, but this reconstruction is
computationally expensive and there is disagreement among researchers about
what regularization or prior assumptions are most appropriate. This article
presents an image segmentation framework for preprocessing such images in order
to reduce the data volume while preserving as much thermal information as
possible for later downstream analyses. The resulting segmented images reflect
thermal properties but do not depend on solving the ill-posed inverse problem.
This allows users to avoid the Poisson inverse problem altogether or to tackle
it on each of ~10 segments rather than on each of ~10^7 pixels,
reducing computing time by a factor of ~10^6. We employ a parametric
class of dissimilarities that can be expressed as cosine dissimilarity
functions or Hellinger distances between nonlinearly transformed vectors of
multi-passband observations in each pixel. We develop a decision theoretic
framework for choosing the dissimilarity that minimizes the expected loss that
arises when estimating identifiable thermal properties based on segmented
images rather than on a pixel-by-pixel basis. We also examine the efficacy of
different dissimilarities for recovering clusters in the underlying thermal
properties. The expected losses are computed under scientifically motivated
prior distributions. Two simulation studies guide our choices of dissimilarity
function. We illustrate our method by segmenting images of a coronal hole
observed on 26 February 2015
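
The link between cosine dissimilarity and Hellinger distance mentioned above can be illustrated with a short Python sketch. It assumes a square-root transform of normalized intensity vectors as the nonlinear transformation; this is only one convenient choice for illustration, not necessarily the parametric class used in the article, and the function names and passband values are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector of non-negative intensities so that it sums to one."""
    total = sum(v)
    return [x / total for x in v]


def cosine_dissimilarity(u, v):
    """One minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions p and q."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


# Hypothetical multi-passband intensities for two pixels.
pixel_a = normalize([120.0, 80.0, 40.0, 10.0])
pixel_b = normalize([100.0, 90.0, 30.0, 30.0])

# After a square-root transform the two notions coincide:
# squared Hellinger distance equals cosine dissimilarity.
sqrt_a = [math.sqrt(x) for x in pixel_a]
sqrt_b = [math.sqrt(x) for x in pixel_b]
h_squared = hellinger_distance(pixel_a, pixel_b) ** 2
cos_dis = cosine_dissimilarity(sqrt_a, sqrt_b)
```

Because the square roots of a normalized vector have unit Euclidean length, `h_squared` and `cos_dis` agree up to floating-point error.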
On methods for correcting for the look-elsewhere effect in searches for new physics
The search for new significant peaks over an energy spectrum often involves a
statistical multiple hypothesis testing problem. Separate tests of hypothesis
are conducted at different locations producing an ensemble of local p-values,
the smallest of which is reported as evidence for the new resonance.
Unfortunately, controlling the false detection rate (type I error rate) of such
procedures may lead to excessively stringent acceptance criteria. In the recent
physics literature, two promising statistical tools have been proposed to
overcome these limitations. In 2005, a method to "find needles in haystacks"
was introduced by Pilla et al. [1], and a second method was later proposed by
Gross and Vitells [2] in the context of the "look elsewhere effect" and trial
factors. We show that, for relatively small sample sizes, the former leads to
an artificial inflation of statistical power that stems from an increase in the
false detection rate, whereas the two methods exhibit similar performance for
large sample sizes. We apply the methods to realistic simulations of the Fermi
Large Area Telescope data, in particular the search for dark matter
annihilation lines. Further, we discuss the counter-intuitive scenario where the
look-elsewhere corrections are more conservative than much more computationally
efficient corrections for multiple hypothesis testing. Finally, we provide
general guidelines for navigating the tradeoffs between statistical and
computational efficiency when selecting a statistical procedure for signal
detection
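
For orientation, the kind of computationally cheap multiple-testing correction that look-elsewhere methods are compared against can be sketched as follows. This is a minimal Python illustration of the standard Bonferroni and Šidák adjustments applied to the smallest local p-value; it does not implement the methods of Pilla et al. or Gross and Vitells, and whether these are the specific corrections the authors have in mind is an assumption. The Šidák form additionally assumes independent tests, and the function names and p-values are hypothetical.

```python
def bonferroni_global_p(local_p_values):
    """Bonferroni bound: number of tests times the smallest local p-value, capped at 1."""
    n_tests = len(local_p_values)
    return min(1.0, n_tests * min(local_p_values))


def sidak_global_p(local_p_values):
    """Sidak adjustment: exact for independent tests, 1 - (1 - p_min)^n."""
    n_tests = len(local_p_values)
    return 1.0 - (1.0 - min(local_p_values)) ** n_tests


# Hypothetical local p-values from tests at four search locations.
local_p = [0.01, 0.2, 0.5, 0.03]
p_bonferroni = bonferroni_global_p(local_p)  # 4 * 0.01
p_sidak = sidak_global_p(local_p)            # 1 - 0.99 ** 4
```

Both adjustments need only the local p-values, which is what makes them cheap; the Bonferroni value is never smaller than the Šidák value.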
Detecting Unspecified Structure in Low-Count Images
Unexpected structure in images of astronomical sources often presents itself
upon visual inspection of the image, but such apparent structure may either
correspond to true features in the source or be due to noise in the data. This
paper presents a method for testing whether inferred structure in an image with
Poisson noise represents a significant departure from a baseline (null) model
of the image. To infer image structure, we conduct a Bayesian analysis of a
full model that uses a multiscale component to allow flexible departures from
the posited null model. As a test statistic, we use a tail probability of the
posterior distribution under the full model. This choice of test statistic
allows us to estimate a computationally efficient upper bound on a p-value that
enables us to draw strong conclusions even when there are limited computational
resources that can be devoted to simulations under the null model. We
demonstrate the statistical performance of our method on simulated images.
Applying our method to an X-ray image of the quasar 0730+257, we find
significant evidence against the null model of a single point source and
uniform background, lending support to the claim of an X-ray jet