Multilevel Bayesian framework for modeling the production, propagation and detection of ultra-high energy cosmic rays
Ultra-high energy cosmic rays (UHECRs) are atomic nuclei with energies over ten million times those accessible to human-made particle accelerators.
Evidence suggests that they originate from relatively nearby extragalactic
sources, but the nature of the sources is unknown. We develop a multilevel
Bayesian framework for assessing association of UHECRs and candidate source
populations, and Markov chain Monte Carlo algorithms for estimating model
parameters and comparing models by computing, via Chib's method, marginal
likelihoods and Bayes factors. We demonstrate the framework by analyzing
measurements of 69 UHECRs observed by the Pierre Auger Observatory (PAO) from
2004-2009, using a volume-complete catalog of 17 local active galactic nuclei
(AGN) out to 15 megaparsecs as candidate sources. An early portion of the data
("period 1," with 14 events) was used by PAO to set an energy cut maximizing
the anisotropy in period 1; the 69 measurements include this "tuned" subset,
and subsequent "untuned" events with energies above the same cutoff. Also,
measurement errors are approximately summarized. These factors are problematic
for independent analyses of PAO data. Within the context of "standard candle"
source models (i.e., with a common isotropic emission rate), and considering
only the 55 untuned events, there is no significant evidence favoring
association of UHECRs with local AGN vs. an isotropic background. The
highest-probability associations are with the two nearest, adjacent AGN,
Centaurus A and NGC 4945. If the association model is adopted, the fraction of
UHECRs that may be associated is likely nonzero but is well below 50%. Our
framework enables estimation of the angular scale for deflection of cosmic rays
by cosmic magnetic fields; relatively modest deflection scales are favored. Models that assign a large fraction of UHECRs to a
single nearby source (e.g., Centaurus A) are ruled out unless very large
deflection scales are specified a priori, and even then they are disfavored.
However, including the period 1 data alters the conclusions significantly, and
a simulation study supports the idea that the period 1 data are anomalous,
presumably due to the tuning. Accurate and optimal analysis of future data will
likely require more complete disclosure of the data.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS654 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
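As a rough illustration of the marginal-likelihood computation via Chib's method mentioned above, the sketch below evaluates Chib's basic identity, log m(y) = log p(y | theta*) + log p(theta*) - log p(theta* | y), on a conjugate normal-mean model where every ordinate is available in closed form. This is not the paper's multilevel UHECR model; the prior settings and simulated data are assumptions of the sketch.

# Chib's marginal-likelihood identity on a conjugate normal-mean model (illustrative sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sigma = 1.0                       # known observation s.d. (assumed)
mu0, tau0 = 0.0, 2.0              # N(mu0, tau0^2) prior on the mean (assumed)
y = rng.normal(1.5, sigma, size=25)

# Exact posterior for the conjugate model.
n, ybar = len(y), y.mean()
tau_n2 = 1.0 / (1.0 / tau0**2 + n / sigma**2)
mu_n = tau_n2 * (mu0 / tau0**2 + n * ybar / sigma**2)

theta_star = mu_n                 # any high-density point works; the posterior mean is convenient

# Chib's identity: log m(y) = log p(y | theta*) + log p(theta*) - log p(theta* | y).
log_lik = stats.norm(theta_star, sigma).logpdf(y).sum()
log_prior = stats.norm(mu0, tau0).logpdf(theta_star)
log_post = stats.norm(mu_n, np.sqrt(tau_n2)).logpdf(theta_star)
log_marglik_chib = log_lik + log_prior - log_post

# Model-agnostic cross-check: brute-force quadrature over theta.
grid = np.linspace(mu_n - 10, mu_n + 10, 20001)
integrand = np.exp(stats.norm(grid[:, None], sigma).logpdf(y).sum(axis=1)
                   + stats.norm(mu0, tau0).logpdf(grid))
log_marglik_quad = np.log(integrand.sum() * (grid[1] - grid[0]))

print(log_marglik_chib, log_marglik_quad)   # the two values should agree closely

In realistic models the posterior ordinate p(theta* | y) is not available in closed form, which is where Chib's method estimates it from the MCMC output; the identity itself is what the sketch verifies.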
Monte Carlo optimization approach for decentralized estimation networks under communication constraints
We consider designing decentralized estimation schemes over bandwidth-limited communication links, with particular interest in the tradeoff between estimation accuracy and the cost of communications due to, e.g., energy consumption. We consider two classes of in-network processing strategies, which yield graph representations by modeling the sensor platforms as vertices and the communication links as edges, together with a tractable Bayesian risk that comprises the cost of transmissions and a penalty for estimation errors. This approach captures a broad range of possibilities for "online" processing of observations as well as the constraints imposed, and it enables a rigorous design setting in the form of a constrained optimization problem. Similar schemes, as well as the structures exhibited by the solutions to the design problem, have been studied previously in the context of decentralized detection. Under reasonable assumptions, the optimization can be carried out in a message-passing fashion. We adopt this framework for estimation; however, the corresponding optimization schemes involve integral operators that cannot be evaluated exactly in general. We develop an approximation framework using Monte Carlo methods and obtain particle representations and approximate computational schemes for both classes of in-network processing strategies and their optimization. The proposed Monte Carlo optimization procedures operate in a scalable and efficient fashion and, owing to their non-parametric nature, can produce results for any distributions provided that samples can be produced from the marginals. In addition, this approach exhibits graceful degradation of the estimation accuracy asymptotically as communication becomes more costly, through a parameterized Bayesian risk.
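To make the Monte Carlo treatment of a communication-constrained Bayesian risk concrete, here is a toy sketch rather than the authors' message-passing procedure: a single sensor quantizes its noisy observation to B bits, the fusion rule is a conditional-mean estimate built from the same Monte Carlo sample (a particle representation), and the empirical risk trades squared error against a per-bit transmission cost. The cost weight lambda_bits, the quantizer range, and the Gaussian models are assumptions of the sketch.

# Monte Carlo evaluation and optimization of a quantize-and-fuse estimation rule (toy sketch).
import numpy as np

rng = np.random.default_rng(1)
N = 50_000
x = rng.normal(0.0, 1.0, N)            # state prior: N(0, 1) (assumed)
y = x + rng.normal(0.0, 0.5, N)        # sensor observation model (assumed)

lambda_bits = 0.02                     # assumed communication cost per transmitted bit

def risk(bits: int) -> float:
    """Empirical Bayes risk of a uniform B-bit quantizer on [-3, 3] plus per-bit cost."""
    levels = 2 ** bits
    edges = np.linspace(-3.0, 3.0, levels + 1)
    idx = np.clip(np.digitize(y, edges) - 1, 0, levels - 1)
    # Fusion rule: conditional mean of x given the received cell index,
    # estimated from the same Monte Carlo sample (a particle representation).
    cell_means = np.array([x[idx == k].mean() if np.any(idx == k) else 0.0
                           for k in range(levels)])
    xhat = cell_means[idx]
    return float(np.mean((x - xhat) ** 2) + lambda_bits * bits)

risks = {b: round(risk(b), 4) for b in range(1, 9)}
best = min(risks, key=risks.get)
print(risks, "-> best bit budget:", best)

In this toy setting the selected bit budget shrinks as lambda_bits grows, mirroring the accuracy/communication tradeoff described above.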
To P or not to P: on the evidential nature of P-values and their place in scientific inference
The customary use of P-values in scientific research has been attacked as
being ill-conceived, and the utility of P-values has been derided. This paper
reviews common misconceptions about P-values and their alleged deficits as
indices of experimental evidence and, using an empirical exploration of the
properties of P-values, documents the intimate relationship between P-values
and likelihood functions. It is shown that P-values quantify experimental
evidence not by their numerical value, but through the likelihood functions
that they index. Many arguments against the utility of P-values are refuted and
the conclusion is drawn that P-values are useful indices of experimental
evidence. The widespread use of P-values in scientific research is well
justified by the actual properties of P-values, but those properties need to be
more widely understood.
Comment: 31 pages, 9 figures, and R code
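As a small sketch of the point that P-values index likelihood functions, the following assumes a one-sample, two-sided z-test with known sigma = 1: the same P-value, combined with the sample size, determines the observed mean and hence the likelihood function for mu. Two experiments with P = 0.05 but different n yield different likelihood functions, so the evidential content lies in the indexed likelihood rather than in the numerical P-value alone.

# Recovering the likelihood function indexed by a two-sided z-test P-value (illustrative sketch).
import numpy as np
from scipy import stats

def likelihood_from_p(p: float, n: int, sigma: float = 1.0):
    """Return a mu grid and the normalized likelihood implied by a two-sided z-test P-value."""
    z = stats.norm.isf(p / 2.0)          # |observed z| implied by the P-value
    se = sigma / np.sqrt(n)
    ybar = z * se                        # observed sample mean (taking the positive sign)
    mu = np.linspace(ybar - 4 * se, ybar + 4 * se, 401)
    lik = stats.norm(mu, se).pdf(ybar)   # likelihood of mu given the observed mean
    return mu, lik / lik.max()

for n in (10, 1000):
    mu, lik = likelihood_from_p(p=0.05, n=n)
    lo, hi = mu[lik > 0.125].min(), mu[lik > 0.125].max()
    print(f"n={n:4d}: likelihood for mu peaks at {mu[np.argmax(lik)]:.3f}; "
          f"1/8-likelihood interval [{lo:.3f}, {hi:.3f}]")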
Bayesian Restricted Likelihood Methods: Conditioning on Insufficient Statistics in Bayesian Regression
Bayesian methods have proven themselves to be successful across a wide range
of scientific problems and have many well-documented advantages over competing
methods. However, these methods run into difficulties for two major and
prevalent classes of problems: handling data sets with outliers and dealing
with model misspecification. We outline the drawbacks of previous solutions to
both of these problems and propose a new method as an alternative. When working
with the new method, the data is summarized through a set of insufficient
statistics, targeting inferential quantities of interest, and the prior
distribution is updated with the summary statistics rather than the complete
data. By careful choice of conditioning statistics, we retain the main benefits
of Bayesian methods while reducing the sensitivity of the analysis to features
of the data not captured by the conditioning statistics. For reducing
sensitivity to outliers, classical robust estimators (e.g., M-estimators) are
natural choices for conditioning statistics. A major contribution of this work
is the development of a data augmented Markov chain Monte Carlo (MCMC)
algorithm for the linear model and a large class of summary statistics. We
demonstrate the method on simulated and real data sets containing outliers and
subject to model misspecification. Success is manifested in better predictive
performance for data points of interest as compared to competing methods.
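As a rough illustration of updating a prior with an insufficient, robust summary rather than the complete data, the sketch below uses a simple ABC rejection step, not the paper's data-augmented MCMC for the linear model, to approximate the posterior of a location parameter given only the sample median of outlier-contaminated data. The prior, the tolerance eps, and the contamination model are assumptions of the sketch.

# Conditioning on a robust summary statistic via ABC rejection (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)

# Contaminated data: a N(2, 1) bulk plus a few gross outliers.
y = np.concatenate([rng.normal(2.0, 1.0, 47), np.array([25.0, 30.0, -20.0])])
obs_summary = np.median(y)                           # robust conditioning statistic

n, eps, n_draws = len(y), 0.05, 100_000
theta = rng.normal(0.0, 5.0, n_draws)                # prior on the location: N(0, 25) (assumed)
y_rep = theta[:, None] + rng.normal(0.0, 1.0, (n_draws, n))
sim_summary = np.median(y_rep, axis=1)
keep = np.abs(sim_summary - obs_summary) < eps       # accept draws whose summary matches

print("posterior mean given the median summary:", theta[keep].mean().round(3))
print("raw sample mean (pulled by the outliers):", y.mean().round(3))

The accepted draws concentrate near the bulk of the data, while the raw sample mean is dragged toward the outliers, which is the kind of robustness the conditioning statistics are chosen to provide.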