4,212 research outputs found

    Multilevel Bayesian framework for modeling the production, propagation and detection of ultra-high energy cosmic rays

    Ultra-high energy cosmic rays (UHECRs) are atomic nuclei with energies over ten million times the energies accessible to human-made particle accelerators. Evidence suggests that they originate from relatively nearby extragalactic sources, but the nature of the sources is unknown. We develop a multilevel Bayesian framework for assessing association of UHECRs with candidate source populations, and Markov chain Monte Carlo algorithms for estimating model parameters and comparing models by computing, via Chib's method, marginal likelihoods and Bayes factors. We demonstrate the framework by analyzing measurements of 69 UHECRs observed by the Pierre Auger Observatory (PAO) from 2004 to 2009, using a volume-complete catalog of 17 local active galactic nuclei (AGN) out to 15 megaparsecs as candidate sources. An early portion of the data ("period 1," with 14 events) was used by PAO to set an energy cut maximizing the anisotropy in period 1; the 69 measurements include this "tuned" subset and subsequent "untuned" events with energies above the same cutoff. In addition, the measurement errors are only approximately summarized. These factors complicate independent analyses of PAO data. Within the context of "standard candle" source models (i.e., with a common isotropic emission rate), and considering only the 55 untuned events, there is no significant evidence favoring association of UHECRs with local AGN over an isotropic background. The highest-probability associations are with the two nearest, adjacent AGN, Centaurus A and NGC 4945. If the association model is adopted, the fraction of UHECRs that may be associated is likely nonzero but well below 50%. Our framework enables estimation of the angular scale for deflection of cosmic rays by cosmic magnetic fields; relatively modest scales of ≈3° to 30° are favored. Models that assign a large fraction of UHECRs to a single nearby source (e.g., Centaurus A) are ruled out unless very large deflection scales are specified a priori, and even then they are disfavored. However, including the period 1 data alters the conclusions significantly, and a simulation study supports the idea that the period 1 data are anomalous, presumably due to the tuning. Accurate and optimal analysis of future data will likely require more complete disclosure of the data.
    Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/13-AOAS654
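    To make the model-comparison machinery concrete, the following is a minimal sketch of Chib's marginal-likelihood identity on a toy semi-conjugate normal model fit by Gibbs sampling, not the paper's multilevel UHECR model; the model, priors, and parameter values are illustrative assumptions.

```python
# Hypothetical illustration of Chib's marginal-likelihood identity,
# log m(y) = log p(y|*) + log p(*) - log p(*|y), evaluated at a
# high-density point, for y_i ~ N(theta, sig2) with priors
# theta ~ N(m0, t0^2) and sig2 ~ InvGamma(a0, b0).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.normal(1.0, 2.0, size=50)
n, ybar = y.size, y.mean()
m0, t0sq, a0, b0 = 0.0, 10.0, 2.0, 2.0

def gibbs(draws=5000):
    theta, sig2 = ybar, y.var()
    out = np.empty((draws, 2))
    for g in range(draws):
        # theta | sig2, y  (normal full conditional)
        v = 1.0 / (n / sig2 + 1.0 / t0sq)
        m = v * (n * ybar / sig2 + m0 / t0sq)
        theta = rng.normal(m, np.sqrt(v))
        # sig2 | theta, y  (inverse-gamma full conditional)
        a = a0 + n / 2.0
        b = b0 + 0.5 * np.sum((y - theta) ** 2)
        sig2 = stats.invgamma.rvs(a, scale=b, random_state=rng)
        out[g] = theta, sig2
    return out

draws = gibbs()
theta_s, sig2_s = draws[:, 0].mean(), draws[:, 1].mean()  # high-density point

loglik = stats.norm.logpdf(y, theta_s, np.sqrt(sig2_s)).sum()
logprior = (stats.norm.logpdf(theta_s, m0, np.sqrt(t0sq))
            + stats.invgamma.logpdf(sig2_s, a0, scale=b0))
# p(theta*|y): Rao-Blackwellized average of the normal full conditional
# over the Gibbs draws of sig2
v_g = 1.0 / (n / draws[:, 1] + 1.0 / t0sq)
m_g = v_g * (n * ybar / draws[:, 1] + m0 / t0sq)
log_p_theta = np.log(np.mean(stats.norm.pdf(theta_s, m_g, np.sqrt(v_g))))
# p(sig2*|theta*, y): closed-form inverse-gamma ordinate
a_star = a0 + n / 2.0
b_star = b0 + 0.5 * np.sum((y - theta_s) ** 2)
log_p_sig2 = stats.invgamma.logpdf(sig2_s, a_star, scale=b_star)

log_marglik = loglik + logprior - (log_p_theta + log_p_sig2)
print(f"Chib log marginal likelihood: {log_marglik:.3f}")
```

    Running this for two competing models and differencing the two log marginal likelihoods gives the log Bayes factor, which is how the paper compares source models.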

    Monte Carlo optimization approach for decentralized estimation networks under communication constraints

    We consider designing decentralized estimation schemes over bandwidth-limited communication links, with a particular interest in the tradeoff between estimation accuracy and the cost of communications due to, e.g., energy consumption. We take into account two classes of in-network processing strategies which yield graph representations, modeling the sensor platforms as vertices and the communication links as edges, together with a tractable Bayesian risk that comprises the cost of transmissions and a penalty for estimation errors. This approach captures a broad range of possibilities for "online" processing of observations as well as the constraints imposed, and enables a rigorous design setting in the form of a constrained optimization problem. Similar schemes, as well as the structures exhibited by the solutions to the design problem, have been studied previously in the context of decentralized detection. Under reasonable assumptions, the optimization can be carried out in a message-passing fashion. We adopt this framework for estimation; however, the corresponding optimization schemes involve integral operators that cannot be evaluated exactly in general. We develop an approximation framework using Monte Carlo methods and obtain particle representations and approximate computational schemes for both classes of in-network processing strategies and their optimization. The proposed Monte Carlo optimization procedures operate in a scalable and efficient fashion and, owing to their non-parametric nature, can produce results for any distributions provided that samples can be produced from the marginals. In addition, this approach exhibits graceful degradation of the estimation accuracy asymptotically as communication becomes more costly, through a parameterized Bayesian risk.
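    As a rough illustration of trading estimation error against communication cost through a parameterized Bayesian risk, the sketch below Monte Carlo-evaluates a single censoring sensor; the one-sensor setup, threshold rule, and cost values are assumptions for illustration, not the paper's message-passing design framework.

```python
# Minimal Monte Carlo sketch: one sensor decides whether to transmit its
# measurement to a fusion center. The Bayesian risk trades squared
# estimation error against a per-message cost c; we scan censoring
# thresholds t and pick the risk-minimizing one.
import numpy as np

rng = np.random.default_rng(1)
N = 200_000
sig_x, sig_n = 1.0, 0.5
x = rng.normal(0.0, sig_x, N)        # state of interest
y = x + rng.normal(0.0, sig_n, N)    # local sensor measurement

gain = sig_x**2 / (sig_x**2 + sig_n**2)  # posterior-mean gain: E[x|y] = gain*y

def mc_risk(t, c):
    """Estimate risk(t) = E[(x - xhat)^2] + c * P(transmit) by Monte Carlo."""
    send = np.abs(y) > t                  # transmit only informative readings
    xhat = np.where(send, gain * y, 0.0)  # silence -> symmetric estimate 0
    return np.mean((x - xhat) ** 2) + c * send.mean()

for c in (0.0, 0.2, 1.0):
    ts = np.linspace(0.0, 3.0, 61)
    risks = [mc_risk(t, c) for t in ts]
    best = ts[int(np.argmin(risks))]
    print(f"c={c:.1f}: best threshold t*={best:.2f}, risk={min(risks):.3f}")
```

    Increasing the communication cost c pushes the optimal threshold upward, so the sensor transmits less often and accuracy degrades gracefully, echoing the asymptotic behavior described in the abstract.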

    To P or not to P: on the evidential nature of P-values and their place in scientific inference

    The customary use of P-values in scientific research has been attacked as being ill-conceived, and the utility of P-values has been derided. This paper reviews common misconceptions about P-values and their alleged deficits as indices of experimental evidence and, using an empirical exploration of the properties of P-values, documents the intimate relationship between P-values and likelihood functions. It is shown that P-values quantify experimental evidence not by their numerical value, but through the likelihood functions that they index. Many arguments against the utility of P-values are refuted, and the conclusion is drawn that P-values are useful indices of experimental evidence. The widespread use of P-values in scientific research is well justified by the actual properties of P-values, but those properties need to be more widely understood.
    Comment: 31 pages, 9 figures and R code
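    To illustrate the claim that a P-value indexes a likelihood function, here is a small sketch assuming a one-sample t-test with illustrative p and n: the P-value and sample size recover the observed t statistic, which in turn fixes the likelihood of the standardized effect via the noncentral t density.

```python
# Sketch: a two-sided P-value plus the sample size of a one-sample t-test
# determines |t|, and hence the likelihood function of the standardized
# effect delta = mu/sigma via the noncentral-t density. The values p=0.04
# and n=20 are illustrative assumptions.
import numpy as np
from scipy import stats

def likelihood_from_p(p, n, deltas):
    """Likelihood L(delta) indexed by a two-sided P-value from an n-sample t-test."""
    df = n - 1
    t_obs = stats.t.isf(p / 2.0, df)  # invert the P-value back to |t|
    nc = deltas * np.sqrt(n)          # noncentrality under effect delta
    return stats.nct.pdf(t_obs, df, nc)

deltas = np.linspace(-0.5, 1.5, 9)
L = likelihood_from_p(p=0.04, n=20, deltas=deltas)
L /= L.max()                          # normalize to max 1 for comparison
for d, l in zip(deltas, L):
    print(f"delta={d:+.2f}  relative likelihood={l:.3f}")
```

    Two experiments with the same test, the same n, and the same P-value therefore index the same likelihood function, which is the sense in which the paper argues P-values carry evidential meaning.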

    Bayesian Restricted Likelihood Methods: Conditioning on Insufficient Statistics in Bayesian Regression

    Bayesian methods have proven themselves successful across a wide range of scientific problems and have many well-documented advantages over competing methods. However, these methods run into difficulties for two major and prevalent classes of problems: handling data sets with outliers and dealing with model misspecification. We outline the drawbacks of previous solutions to both of these problems and propose a new method as an alternative. Under the new method, the data are summarized through a set of insufficient statistics targeting inferential quantities of interest, and the prior distribution is updated with the summary statistics rather than the complete data. By careful choice of conditioning statistics, we retain the main benefits of Bayesian methods while reducing the sensitivity of the analysis to features of the data not captured by the conditioning statistics. For reducing sensitivity to outliers, classical robust estimators (e.g., M-estimators) are natural choices for conditioning statistics. A major contribution of this work is the development of a data-augmented Markov chain Monte Carlo (MCMC) algorithm for the linear model and a large class of summary statistics. We demonstrate the method on simulated and real data sets containing outliers and subject to model misspecification. Success is manifested in better predictive performance for data points of interest as compared to competing methods.
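    The paper's data-augmented MCMC is considerably more sophisticated; as a simplified stand-in for conditioning on an insufficient statistic, the sketch below uses ABC-style rejection to condition a normal-location posterior on the sample median, a robust summary, so a gross outlier barely moves the posterior. The prior, tolerance, and outlier scenario are illustrative assumptions, not the authors' algorithm.

```python
# ABC-style rejection sampler conditioning on an insufficient statistic:
# target the posterior of theta given only the sample median of
# y_i ~ N(theta, 1), with prior theta ~ N(0, 5^2). The median is robust,
# so the single gross outlier has little influence on the result.
import numpy as np

rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(0.0, 1.0, 30), [25.0]])  # one gross outlier
s_obs = np.median(data)                                    # conditioning statistic
n = data.size

def abc_posterior(draws=200_000, eps=0.05):
    theta = rng.normal(0.0, 5.0, draws)                    # prior draws
    sims = rng.normal(theta[:, None], 1.0, (draws, n))     # simulated datasets
    keep = np.abs(np.median(sims, axis=1) - s_obs) < eps   # match the summary
    return theta[keep]

post = abc_posterior()
print(f"accepted {post.size} draws")
print(f"posterior given the median: mean={post.mean():.3f}, sd={post.std():.3f}")
print(f"naive full-data sample mean (outlier-sensitive): {data.mean():.3f}")
```

    The contrast between the summary-conditioned posterior mean and the naive sample mean shows the intended robustness: the outlier drags the full-data mean upward while the restricted posterior stays near the bulk of the data.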