Random effects compound Poisson model to represent data with extra zeros
This paper describes a compound Poisson-based random effects structure for
modeling zero-inflated data. Data with a large proportion of zeros arise in
many fields of applied statistics, for example in ecology when trying to model
and predict species counts (discrete data) or abundance distributions
(continuous data). Standard methods for modeling such data include mixture and
two-part conditional models. In contrast to these methods, the stochastic models
proposed here behave coherently under a change of scale, since they
mimic the harvesting of a marked Poisson process in the modeling steps. Random
effects are used to account for inhomogeneity. In this paper, model design and
inference both rely on conditional thinking to understand the links between
various layers of quantities: parameters, latent variables including random
effects and zero-inflated observations. The potential of these parsimonious
hierarchical models for zero-inflated data is exemplified using two marine
macroinvertebrate abundance datasets from a large-scale scientific bottom-trawl
survey. The EM algorithm with a Monte Carlo step based on importance sampling
is checked for this model structure on a simulated dataset: it proves to work
well for parameter estimation, but parameter values matter when re-assessing the
actual coverage level of the confidence regions far from the asymptotic
conditions.
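To make the zero-inflation mechanism concrete, here is a minimal simulation sketch of a compound Poisson model with a lognormal random effect; it is an illustration of the model class, not the authors' code, and all parameter values are arbitrary. Exact zeros occur whenever the latent Poisson count is zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_site(lam, shape, scale, sigma_u, n_hauls, rng):
    """Simulate abundances for one site sharing a lognormal random effect."""
    u = rng.lognormal(mean=0.0, sigma=sigma_u)   # site-level random effect
    counts = rng.poisson(lam * u, size=n_hauls)  # latent Poisson counts
    # Each abundance is the sum of `count` gamma-distributed marks,
    # so a zero count yields an exact zero observation.
    return np.array([rng.gamma(shape, scale, size=c).sum() for c in counts])

data = np.concatenate([simulate_site(0.8, 2.0, 1.5, 0.5, 20, rng)
                       for _ in range(50)])
print(f"proportion of exact zeros: {(data == 0).mean():.2f}")
```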
Fuzzy Supernova Templates I: Classification
Modern supernova (SN) surveys are now uncovering stellar explosions at rates
that far surpass what the world's spectroscopic resources can handle. In order
to make full use of these SN datasets, it is necessary to use analysis methods
that depend only on the survey photometry. This paper presents two methods for
utilizing a set of SN light curve templates to classify SN objects. In the
first case we present an updated version of the Bayesian Adaptive Template
Matching program (BATM). To address some shortcomings of that strictly Bayesian
approach, we introduce a method for Supernova Ontology with Fuzzy Templates
(SOFT), which utilizes Fuzzy Set Theory for the definition and combination of
SN light curve models. For well-sampled light curves with a modest
signal-to-noise ratio (S/N > 10), the SOFT method can correctly separate
thermonuclear (Type Ia) SNe from core-collapse SNe with 98% accuracy. In
addition, the SOFT
method has the potential to classify supernovae into sub-types, providing
photometric identification of very rare or peculiar explosions. The accuracy
and precision of the SOFT method are verified using Monte Carlo simulations as
well as real SN light curves from the Sloan Digital Sky Survey and the
SuperNova Legacy Survey. In a subsequent paper the SOFT method is extended to
address the problem of parameter estimation, providing estimates of redshift,
distance, and host galaxy extinction without any spectroscopy.
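As a rough illustration of the fuzzy-template idea, the sketch below maps each template's reduced chi-square misfit to a membership grade and normalizes across the template set. The Gaussian membership function, its width, and the toy templates are assumptions for illustration; the published SOFT definitions may differ.

```python
import numpy as np

def fuzzy_memberships(flux, flux_err, templates):
    """Normalized membership grades of one light curve in each template."""
    grades = []
    for tmpl in templates:
        chi2_dof = np.mean(((flux - tmpl) / flux_err) ** 2)
        grades.append(np.exp(-0.5 * chi2_dof))  # map misfit into [0, 1]
    grades = np.array(grades)
    return grades / grades.sum()                # combine across the set

# Toy example: a noisy "Ia-like" curve scored against two crude templates.
t = np.linspace(0.0, 50.0, 25)
ia_tmpl = np.exp(-0.5 * ((t - 18.0) / 8.0) ** 2)
cc_tmpl = np.exp(-0.5 * ((t - 25.0) / 15.0) ** 2)
obs = ia_tmpl + np.random.default_rng(1).normal(0.0, 0.05, t.size)
print(fuzzy_memberships(obs, 0.05, [ia_tmpl, cc_tmpl]))
```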
Methods for Bayesian power spectrum inference with galaxy surveys
We derive and implement a full Bayesian large-scale structure inference
method aiming at precision recovery of the cosmological power spectrum from
galaxy redshift surveys. Our approach improves over previous Bayesian methods
by performing a joint inference of the three-dimensional density field, the
cosmological power spectrum, luminosity-dependent galaxy biases and
corresponding normalizations. We account for all joint and correlated
uncertainties between all inferred quantities. Classes of galaxies with
different biases are treated as separate subsamples. The method therefore also
allows the combined analysis of more than one galaxy survey.
In particular, it solves the problem of inferring the power spectrum from
galaxy surveys with non-trivial survey geometries by exploring the joint
posterior distribution with efficient implementations of multiple block Markov
chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high
statistical efficiency in low signal-to-noise regimes by using a deterministic
reversible jump algorithm. We test our method on an artificial mock galaxy
survey, emulating characteristic features of the Sloan Digital Sky Survey data
release 7, such as its survey geometry and luminosity-dependent biases. These
tests demonstrate the numerical feasibility of our large-scale Bayesian
inference framework when the parameter space has millions of dimensions.
The method reveals and correctly treats the anti-correlation between bias
amplitudes and the power spectrum, which is not taken into account in current
approaches to power spectrum estimation, a 20 percent effect across large
ranges in k-space. In addition, the method results in constrained realizations
of density fields obtained without assuming the power spectrum or bias
parameters in advance.
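The block-sampling structure can be illustrated with a deliberately tiny one-dimensional analogue (an assumption for illustration, far simpler than the method above): alternately draw a Gaussian signal field conditional on the current power, then the per-mode power conditional on the field, given noisy data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_modes, noise_var = 64, 1.0
true_power = 5.0 / (1.0 + np.arange(n_modes))      # assumed toy spectrum
signal = rng.normal(0.0, np.sqrt(true_power))
data = signal + rng.normal(0.0, np.sqrt(noise_var), size=n_modes)

power = np.ones(n_modes)
for _ in range(2000):
    # Block 1: draw signal | power, data (the Wiener posterior, per mode).
    post_var = 1.0 / (1.0 / power + 1.0 / noise_var)
    post_mean = post_var * data / noise_var
    s = rng.normal(post_mean, np.sqrt(post_var))
    # Block 2: draw power | signal (inverse-gamma under a Jeffreys prior).
    power = 0.5 * s ** 2 / rng.gamma(0.5, 1.0, size=n_modes)

print(f"first modes of the final power draw: {np.round(power[:3], 2)}")
```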
Shrinkage Estimation of the Power Spectrum Covariance Matrix
We seek to improve estimates of the power spectrum covariance matrix from a
limited number of simulations by employing a novel statistical technique known
as shrinkage estimation. The shrinkage technique optimally combines an
empirical estimate of the covariance with a model (the target) to minimize the
total mean squared error compared to the true underlying covariance. We test
this technique on N-body simulations and evaluate its performance by estimating
cosmological parameters. Using a simple diagonal target, we show that the
shrinkage estimator significantly outperforms both the empirical covariance and
the target individually when using a small number of simulations. We find that
reducing noise in the covariance estimate is essential for properly estimating
the values of cosmological parameters as well as their confidence intervals. We
extend our method to the jackknife covariance estimator and again find
significant improvement, though simulation-based estimates still give better
results. Even for
thousands of simulations we still find evidence that our method improves
estimation of the covariance matrix. Because our method is simple, requires
negligible additional numerical effort, and produces superior results, we
advocate shrinkage estimation for the covariance of the power spectrum
and other large-scale structure measurements whenever purely theoretical
modeling of the covariance is insufficient.
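A minimal sketch of the core idea follows: linearly combine the empirical covariance with a diagonal target, with the intensity chosen to minimize the estimated mean squared error. The closed-form intensity below is the Schäfer-Strimmer form for a diagonal target, an assumption for illustration rather than necessarily the paper's exact estimator.

```python
import numpy as np

def shrink_covariance(samples):
    """samples: (n_sims, n_bins) array of power spectrum measurements."""
    n, p = samples.shape
    xc = samples - samples.mean(axis=0)
    w = xc[:, :, None] * xc[:, None, :]   # per-simulation outer products
    s = w.sum(axis=0) / (n - 1)           # empirical covariance
    # Estimated variance of each covariance entry, then the optimal
    # intensity for shrinking the off-diagonal entries toward zero.
    var_s = n / (n - 1) ** 3 * ((w - w.mean(axis=0)) ** 2).sum(axis=0)
    off = ~np.eye(p, dtype=bool)
    lam = np.clip(var_s[off].sum() / (s[off] ** 2).sum(), 0.0, 1.0)
    shrunk = s.copy()
    shrunk[off] *= 1.0 - lam
    return shrunk, lam

sims = np.random.default_rng(3).multivariate_normal(
    np.zeros(5), np.eye(5) + 0.3, size=20)
cov, lam = shrink_covariance(sims)
print(f"shrinkage intensity toward the diagonal target: {lam:.2f}")
```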
Evaluation of advanced optimisation methods for estimating Mixed Logit models
The performance of different simulation-based estimation techniques for mixed logit modeling is evaluated. A quasi-Monte Carlo method (modified Latin hypercube sampling) is compared with a Monte Carlo algorithm with dynamic accuracy. The classic line-search approach of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization algorithm is also compared with trust region methods, which have proved extremely powerful in nonlinear programming. Numerical tests are performed on two real data sets: stated preference data for parking type collected in the United Kingdom, and revealed preference data for mode choice collected as part of a German travel diary survey. Several criteria are used to evaluate the approximation quality of the log-likelihood function, the accuracy of the results, and the associated estimation runtime. Results suggest that the trust region approach outperforms the BFGS approach and that Monte Carlo methods remain competitive with quasi-Monte Carlo methods in high-dimensional problems, especially when an adaptive optimization algorithm is used.
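To show what the quasi-Monte Carlo draws look like in practice, here is a small sketch comparing Modified Latin Hypercube Sampling with plain pseudo-random draws when simulating a mixed logit choice probability. The three-alternative setup, utility values, and the single normal random coefficient are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def mlhs(n_draws, n_dims, rng):
    """One evenly spaced, randomly shifted, shuffled sequence per dimension."""
    base = np.arange(n_draws) / n_draws
    cols = [rng.permutation(base + rng.uniform(0.0, 1.0 / n_draws))
            for _ in range(n_dims)]
    return np.column_stack(cols)

rng = np.random.default_rng(4)
R = 200
v = np.array([0.5, -0.2, 0.1])             # systematic utilities (toy values)
for name, u in [("MLHS", mlhs(R, 1, rng)),
                ("pseudo-random", rng.uniform(size=(R, 1)))]:
    beta = norm.ppf(u)                     # standard normal random coefficient
    util = v[None, :] * (1.0 + beta)       # utilities with random taste scaling
    p = np.exp(util[:, 0]) / np.exp(util).sum(axis=1)
    print(f"{name}: simulated probability of alternative 1 = {p.mean():.4f}")
```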
Subsampling MCMC - An introduction for the survey statistician
The rapid development of computing power and efficient Markov Chain Monte
Carlo (MCMC) simulation algorithms have revolutionized Bayesian statistics,
making it a highly practical inference method in applied work. However, MCMC
algorithms tend to be computationally demanding, and are particularly slow for
large datasets. Data subsampling has recently been suggested as a way to make
MCMC methods scalable to very large datasets, utilizing efficient sampling
schemes and estimators from the survey sampling literature. These developments
tend to be unknown to many survey statisticians, who traditionally work with
non-Bayesian methods and rarely use MCMC. Our article explains the idea of
data subsampling in MCMC by reviewing one strand of work, Subsampling MCMC, a
so-called pseudo-marginal MCMC approach to speeding up MCMC through data
subsampling. The review is written for a survey statistician without previous
knowledge of MCMC methods, since our aim is to motivate survey sampling experts
to contribute to the growing Subsampling MCMC literature.
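As a deliberately simplified sketch of the key idea, the toy below replaces the full-data log-likelihood sum inside Metropolis-Hastings with a survey-style expansion estimator from a with-replacement subsample. Real Subsampling MCMC adds control variates and corrects the bias introduced by exponentiating a noisy log-likelihood estimate; this toy omits both, so it only approximately targets the posterior. The Gaussian toy model and all tuning constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.normal(1.0, 1.0, size=100_000)      # toy data with unknown mean
n, m = data.size, 1_000                        # population and subsample sizes

def loglik_hat(theta):
    """Hansen-Hurwitz-style expansion of the log-likelihood from a subsample."""
    idx = rng.integers(0, n, size=m)           # with-replacement sample
    return (n / m) * np.sum(-0.5 * (data[idx] - theta) ** 2)

theta, ll = 0.0, loglik_hat(0.0)
chain = []
for _ in range(5_000):
    prop = theta + rng.normal(0.0, 0.02)       # random-walk proposal
    ll_prop = loglik_hat(prop)
    if np.log(rng.uniform()) < ll_prop - ll:   # Metropolis accept/reject
        theta, ll = prop, ll_prop
    chain.append(theta)

print(f"posterior mean estimate: {np.mean(chain[1000:]):.3f}")
```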