Statistical unfolding of elementary particle spectra: Empirical Bayes estimation and bias-corrected uncertainty quantification
We consider the high energy physics unfolding problem where the goal is to
estimate the spectrum of elementary particles given observations distorted by
the limited resolution of a particle detector. This important statistical
inverse problem arising in data analysis at the Large Hadron Collider at CERN
consists in estimating the intensity function of an indirectly observed Poisson
point process. Unfolding typically proceeds in two steps: one first produces a
regularized point estimate of the unknown intensity and then uses the
variability of this estimator to form frequentist confidence intervals that
quantify the uncertainty of the solution. In this paper, we propose forming the
point estimate using empirical Bayes estimation which enables a data-driven
choice of the regularization strength through marginal maximum likelihood
estimation. Observing that neither Bayesian credible intervals nor standard
bootstrap confidence intervals succeed in achieving good frequentist coverage
in this problem due to the inherent bias of the regularized point estimate, we
introduce an iteratively bias-corrected bootstrap technique for constructing
improved confidence intervals. We show using simulations that this enables us
to achieve nearly nominal frequentist coverage with only a modest increase in
interval length. The proposed methodology is applied to unfolding the boson
invariant mass spectrum as measured in the CMS experiment at the Large Hadron
Collider.
Comment: Published at http://dx.doi.org/10.1214/15-AOAS857 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org). arXiv admin note:
substantial text overlap with arXiv:1401.827
Inference on a Distribution from Noisy Draws
We consider a situation where the distribution of a random variable is being
estimated by the empirical distribution of noisy measurements of that variable.
This is common practice in, for example, teacher value-added models and other
fixed-effect models for panel data. We use an asymptotic embedding where the
noise shrinks with the sample size to calculate the leading bias in the
empirical distribution arising from the presence of noise. The leading bias in
the empirical quantile function is also obtained. These calculations are new
in the literature, where only results on smooth functionals such as the mean
and variance have been derived. Given a closed-form expression for the bias,
bias-corrected estimators of the distribution function and quantile function can
be constructed. We provide both analytical and jackknife corrections that
recenter the limit distribution and yield confidence intervals with correct
coverage in large samples. These corrections are non-parametric and easy to
implement. Our approach can be connected to corrections for selection bias and
shrinkage estimation and is to be contrasted with deconvolution. Simulation
results confirm the much-improved sampling behavior of the corrected
estimators.
Comment: 24 pages main text, 22 pages appendix (including references)
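To make the idea of a jackknife bias correction concrete, here is a minimal sketch of the generic delete-one jackknife applied to the plug-in variance. It is not the correction derived in the paper above; the function names and the sample values are illustrative assumptions.

```python
import statistics


def plug_in_variance(xs):
    """Plug-in variance (divides by n), which is biased downward."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def jackknife_bias_corrected(estimator, xs):
    """Delete-one jackknife: n * theta_hat - (n - 1) * mean of leave-one-out estimates."""
    n = len(xs)
    theta_hat = estimator(xs)
    leave_one_out = [estimator(xs[:i] + xs[i + 1:]) for i in range(n)]
    return n * theta_hat - (n - 1) * sum(leave_one_out) / n


# Hypothetical sample, chosen only for illustration.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]
corrected = jackknife_bias_corrected(plug_in_variance, sample)
unbiased = statistics.variance(sample)  # divides by n - 1
```

For this particular estimator the jackknife removes the bias exactly, so `corrected` agrees with `unbiased` up to rounding. For general estimators only the leading bias term is removed.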
An alternative marginal likelihood estimator for phylogenetic models
Bayesian phylogenetic methods are generating noticeable enthusiasm in the
field of molecular systematics. Several phylogenetic models are often in play, and
different approaches are used to compare them within a Bayesian framework. The
Bayes factor, defined as the ratio of the marginal likelihoods of two competing
models, plays a key role in Bayesian model selection. We focus on an
alternative estimator of the marginal likelihood, whose computation is still a
challenging problem. Several computational solutions have been proposed, none of
which outperforms the others simultaneously in terms of
simplicity of implementation, computational burden and precision of the
estimates. Practitioners and researchers, often led by available software, have
so far favored the simplicity of the harmonic mean estimator (HM) and the
arithmetic mean estimator (AM). However, it is known that the resulting
estimates of the Bayesian evidence in favor of one model are biased and often
inaccurate, to the point of having infinite variance, so that the reliability of the
corresponding conclusions is doubtful. Our new implementation of the
generalized harmonic mean (GHM) idea recycles MCMC simulations from the
posterior, shares the computational simplicity of the original HM estimator,
but, unlike it, overcomes the infinite variance issue. The alternative
estimator is applied to simulated phylogenetic data and produces fully
satisfactory results outperforming those simple estimators currently provided
by most of the publicly available software.
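The two simple estimators named in the abstract above can be sketched on a toy conjugate model where the marginal likelihood is known in closed form. This is an assumption-laden illustration, not a phylogenetic model and not the GHM estimator of the paper: one observation `y ~ N(theta, 1)` with prior `theta ~ N(0, 1)`, so that marginally `y ~ N(0, 2)` and the posterior is `N(y/2, 1/2)`. All names are hypothetical.

```python
import math
import random


def log_lik(theta, y):
    """Log-likelihood of a single observation y ~ N(theta, 1)."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (y - theta) ** 2


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


rng = random.Random(0)
y = 1.0
n_draws = 20000

# Exact log marginal likelihood: y ~ N(0, 2) under the toy model.
true_log_ml = -0.5 * math.log(2 * math.pi * 2.0) - y ** 2 / 4.0

# Arithmetic mean (AM): average the likelihood over draws from the prior.
prior_draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
log_am = log_mean_exp([log_lik(t, y) for t in prior_draws])

# Harmonic mean (HM): harmonic average of the likelihood over posterior draws.
# In a real analysis these draws would come from MCMC; here the posterior is exact.
post_draws = [rng.gauss(y / 2.0, math.sqrt(0.5)) for _ in range(n_draws)]
log_hm = -log_mean_exp([-log_lik(t, y) for t in post_draws])
```

In this toy model the inverse likelihood has infinite variance under the posterior, which is the instability the abstract refers to. The HM value can therefore change noticeably between seeds even with many draws.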
Improving population-specific allele frequency estimates by adapting supplemental data: an empirical Bayes approach
Estimation of the allele frequency at genetic markers is a key ingredient in
biological and biomedical research, such as studies of human genetic variation
or of the genetic etiology of heritable traits. As genetic data becomes
increasingly available, investigators face a dilemma: when should data from
other studies and population subgroups be pooled with the primary data? Pooling
additional samples will generally reduce the variance of the frequency
estimates; however, used inappropriately, pooled estimates can be severely
biased due to population stratification. Because of this potential bias, most
investigators avoid pooling, even for samples with the same ethnic background
and residing on the same continent. Here, we propose an empirical Bayes
approach for estimating allele frequencies of single nucleotide polymorphisms.
This procedure adaptively incorporates genotypes from related samples, so that
more similar samples have a greater influence on the estimates. In every
example we have considered, our estimator achieves a mean squared error (MSE)
that is smaller than that of either pooling or not pooling, and sometimes substantially
improves over both extremes. The bias introduced is small, as is shown by a
simulation study that is carefully matched to a real data example. Our method
is particularly useful when small groups of individuals are genotyped at a
large number of markers, a situation we are likely to encounter in a
genome-wide association study.
Comment: Published at http://dx.doi.org/10.1214/07-AOAS121 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
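The trade-off between pooling and not pooling can be illustrated with a Beta-binomial posterior mean. This is a sketch under stated assumptions, not the estimator of the paper above: the supplemental sample only sets the prior mean, and `prior_strength` is a hypothetical fixed tuning value, whereas the paper chooses the influence of related samples adaptively from the data. All counts are invented for illustration.

```python
def shrunken_frequency(primary_count, primary_n, supp_count, supp_n, prior_strength):
    """Posterior mean of an allele frequency under a Beta prior centred on the
    supplemental frequency, with prior_strength acting as a pseudo-sample size."""
    supp_freq = supp_count / supp_n
    return (primary_count + prior_strength * supp_freq) / (primary_n + prior_strength)


# Hypothetical counts: 6 of 20 alleles in the primary sample,
# 300 of 2000 alleles in the supplemental sample.
no_pooling = shrunken_frequency(6, 20, 300, 2000, 0.0)       # primary data only
partial = shrunken_frequency(6, 20, 300, 2000, 20.0)         # equal weight to the prior
full_pooling = shrunken_frequency(6, 20, 300, 2000, 2000.0)  # same as merging both samples
```

Setting `prior_strength` to zero ignores the supplemental data, while setting it to the supplemental sample size reproduces the fully pooled estimate. Intermediate values give estimates between the two extremes.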
Empirical Bayes selection of wavelet thresholds
This paper explores a class of empirical Bayes methods for level-dependent
threshold selection in wavelet shrinkage. The prior considered for each wavelet
coefficient is a mixture of an atom of probability at zero and a heavy-tailed
density. The mixing weight, or sparsity parameter, for each level of the
transform is chosen by marginal maximum likelihood. If estimation is carried
out using the posterior median, this is a random thresholding procedure; the
estimation can also be carried out using other thresholding rules with the same
threshold. Details of the calculations needed for implementing the procedure
are included. In practice, the estimates are quick to compute and there is
software available. Simulations on the standard model functions show excellent
performance, and applications to data drawn from various fields of application
are used to explore the practical performance of the approach. By using a
general result on the risk of the corresponding marginal maximum likelihood
approach for a single sequence, overall bounds on the risk of the method are
found subject to membership of the unknown function in one of a wide range of
Besov classes, covering also the case of f of bounded variation. The rates
obtained are optimal for any value of the parameter $p \in (0,\infty]$,
simultaneously for a wide range of loss functions, each dominating the $L_q$ norm
of the $\sigma$th derivative, with $\sigma \ge 0$ and $0 < q \le 2$.
Comment: Published at http://dx.doi.org/10.1214/009053605000000345 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
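The abstract above notes that, once a threshold is chosen, other thresholding rules can be applied with the same threshold. A minimal sketch of the two most common rules follows. The threshold and the coefficients are hypothetical fixed values, whereas in the paper the threshold is selected per level by marginal maximum likelihood, which is not implemented here.

```python
def hard_threshold(coef, t):
    """Keep the coefficient unchanged if it exceeds the threshold in magnitude, else zero it."""
    return coef if abs(coef) > t else 0.0


def soft_threshold(coef, t):
    """Zero small coefficients and shrink the rest toward zero by t."""
    if coef > t:
        return coef - t
    if coef < -t:
        return coef + t
    return 0.0


# Hypothetical wavelet coefficients at one resolution level.
coefficients = [-3.0, -0.4, 0.1, 0.9, 2.5]
threshold = 1.0

hard = [hard_threshold(c, threshold) for c in coefficients]
soft = [soft_threshold(c, threshold) for c in coefficients]
```

Both rules set the same coefficients to zero. They differ only in whether the surviving coefficients are also shrunk.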
The functional subdivision of the visual brain: Is there a real illusion effect on action? A multi-lab replication study
Acknowledgements: We thank Brian Roberts and Mike Harris for responding to our questions regarding their paper; Zoltan Dienes for advice on Bayes factors; Denise Fischer, Melanie Römer, Ioana Stanciu, Aleksandra Romanczuk, Stefano Uccelli, Nuria Martos Sánchez, and Rosa María Beño Ruiz de la Sierra for help collecting data; Eva Viviani for managing data collection in Parma. We thank Maurizio Gentilucci for letting us use his lab, and the Centro Intradipartimentale Mente e Cervello (CIMeC), University of Trento, and especially Francesco Pavani for lending us his motion tracking equipment. We thank Rachel Foster for proofreading. KKK was supported by a Ph.D. scholarship as part of a grant to VHF within the International Graduate Research Training Group on Cross-Modal Interaction in Natural and Artificial Cognitive Systems (CINACS; DFG IKG-1247) and TS by a grant (DFG – SCHE 735/3-1); both from the German Research Council. Peer reviewed. Postprint.