Approximate Bayesian Computation by Modelling Summary Statistics in a Quasi-likelihood Framework
Approximate Bayesian Computation (ABC) is a useful class of methods for
Bayesian inference when the likelihood function is computationally intractable.
In practice, the basic ABC algorithm may be inefficient when the prior and
posterior differ substantially. Therefore, more elaborate methods,
such as ABC with the Markov chain Monte Carlo algorithm (ABC-MCMC), should be
used. However, the elaboration of a proposal density for MCMC is a sensitive
issue and very difficult in the ABC setting, where the likelihood is
intractable. We discuss an automatic proposal distribution useful for ABC-MCMC
algorithms. This proposal is inspired by the theory of quasi-likelihood (QL)
functions and is obtained by modelling the distribution of the summary
statistics as a function of the parameters. Essentially, given a real-valued
vector of summary statistics, we reparametrize the model by means of a
regression function of the statistics on parameters, obtained by sampling from
the original model in a pilot-run simulation study. The QL theory is well
established for a scalar parameter, and it is shown that when the conditional
variance of the summary statistic is assumed constant, the QL has a closed-form
normal density. This idea of constructing proposal distributions is extended to
non-constant variance and to real-valued parameter vectors. The method is
illustrated by several examples and by an application to a real problem in
population genetics.
Comment: Published at http://dx.doi.org/10.1214/14-BA921 in the Bayesian
Analysis (http://projecteuclid.org/euclid.ba) by the International Society of
Bayesian Analysis (http://bayesian.org/
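
The proposal construction described in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy model (normal data with unknown mean, summarised by the sample mean), the linear least-squares fit of the summary statistic on the parameter in the pilot run, the constant-variance normal independence proposal obtained by inverting that fit, and the factor of 2 used to widen the proposal. The helper names `simulate_summary`, `pilot_regression` and `abc_mcmc` are hypothetical.

```python
import math
import random
import statistics


def simulate_summary(theta, n, rng):
    """Toy model (assumed): n draws from Normal(theta, 1), summarised by their mean."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))


def pilot_regression(thetas, n, rng):
    """Pilot run: least-squares fit of the summary statistic on the parameter."""
    summaries = [simulate_summary(t, n, rng) for t in thetas]
    t_bar = statistics.fmean(thetas)
    s_bar = statistics.fmean(summaries)
    sxx = sum((t - t_bar) ** 2 for t in thetas)
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(thetas, summaries))
    slope = sxy / sxx
    intercept = s_bar - slope * t_bar
    resid_var = sum(
        (s - intercept - slope * t) ** 2 for t, s in zip(thetas, summaries)
    ) / (len(thetas) - 2)
    return intercept, slope, resid_var


def abc_mcmc(s_obs, n, eps, n_iter, prior_sd, rng):
    """ABC-MCMC with a normal independence proposal built from the pilot fit."""
    grid = [-5.0 + 10.0 * i / 199 for i in range(200)]
    intercept, slope, resid_var = pilot_regression(grid, n, rng)
    # Invert the fitted regression at the observed summary (constant variance).
    prop_mean = (s_obs - intercept) / slope
    # Assumption: widen the proposal by a factor of 2 to cover the ABC target.
    prop_sd = 2.0 * math.sqrt(resid_var) / abs(slope)

    def log_prior(t):  # Normal(0, prior_sd^2) prior, up to a constant
        return -0.5 * (t / prior_sd) ** 2

    def log_prop(t):  # proposal density, up to a constant
        return -0.5 * ((t - prop_mean) / prop_sd) ** 2

    theta = prop_mean
    chain = []
    for _ in range(n_iter):
        cand = rng.gauss(prop_mean, prop_sd)
        # ABC step: keep the candidate only if its simulated summary is close.
        if abs(simulate_summary(cand, n, rng) - s_obs) < eps:
            log_ratio = (log_prior(cand) + log_prop(theta)
                         - log_prior(theta) - log_prop(cand))
            if math.log(1.0 - rng.random()) < log_ratio:
                theta = cand
        chain.append(theta)
    return chain
```

Calling `abc_mcmc(s_obs=2.0, n=50, eps=0.1, n_iter=3000, prior_sd=10.0, rng=random.Random(0))` returns a chain whose values concentrate near the observed summary. The likelihood is never evaluated; it enters only through the simulated summaries.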
Non-linear regression models for Approximate Bayesian Computation
Approximate Bayesian inference on the basis of summary statistics is
well-suited to complex problems for which the likelihood is either
mathematically or computationally intractable. However, methods that use
rejection suffer from the curse of dimensionality when the number of summary
statistics is increased. Here we propose a machine-learning approach to the
estimation of the posterior density by introducing two innovations. The new
method fits a nonlinear conditional heteroscedastic regression of the parameter
on the summary statistics, and then adaptively improves estimation using
importance sampling. The new algorithm is compared to the state-of-the-art
approximate Bayesian methods, and achieves considerable reduction of the
computational burden in two examples of inference in statistical genetics and
in a queueing model.
Comment: 4 figures; version 3 minor changes; to appear in Statistics and
Computin
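
The idea of regressing the parameter on the summary statistics can be sketched with a deliberately simpler technique than the one in the abstract above: a linear, homoscedastic regression adjustment applied to the accepted draws, with no importance-sampling step. The nonlinear heteroscedastic regression of the paper is not reproduced here. The toy model, the uniform prior on (-5, 5) and the helper names are assumptions made for illustration.

```python
import random
import statistics


def simulate_summary(theta, n, rng):
    """Toy model (assumed): n draws from Normal(theta, 1), summarised by their mean."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))


def abc_regression_adjust(s_obs, n, n_sims, keep_frac, rng):
    """Rejection ABC followed by a linear regression adjustment of the draws."""
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # assumed uniform prior
        draws.append((theta, simulate_summary(theta, n, rng)))

    # Rejection step: keep the fraction of draws closest to the observed summary.
    draws.sort(key=lambda pair: abs(pair[1] - s_obs))
    kept = draws[: max(3, int(keep_frac * n_sims))]
    raw = [t for t, _ in kept]
    summaries = [s for _, s in kept]

    # Least-squares slope of the parameter on the summary among kept draws.
    t_bar = statistics.fmean(raw)
    s_bar = statistics.fmean(summaries)
    slope = (
        sum((s - s_bar) * (t - t_bar) for t, s in kept)
        / sum((s - s_bar) ** 2 for s in summaries)
    )

    # Adjustment: shift each draw to where it would sit if its summary were s_obs.
    adjusted = [t - slope * (s - s_obs) for t, s in kept]
    return raw, adjusted
```

In this toy setting the adjusted draws are less spread out than the raw accepted draws, which is why a regression step allows a looser tolerance and fewer simulations than plain rejection.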
Statistical and Computational Tradeoff in Genetic Algorithm-Based Estimation
When a Genetic Algorithm (GA), or a stochastic algorithm in general, is
employed in a statistical problem, the obtained result is affected by both
variability due to sampling, which arises because only a sample is
observed, and variability due to the stochastic elements of the algorithm. This
topic fits naturally within the statistical and computational
tradeoff question, crucial in recent problems, in which statisticians must
carefully balance the statistical and computational parts of the analysis, taking
account of resource or time constraints. In the present work we analyze
estimation problems tackled by GAs, for which the variability of estimates can be
decomposed into these two sources, subject to constraints in
the form of cost functions, related to both data acquisition and runtime of the
algorithm. Simulation studies will be presented to discuss the statistical and
computational tradeoff question.
Comment: 17 pages, 5 figure
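
The decomposition of variability into a sampling part and an algorithmic part, as described in the abstract above, can be illustrated by a small nested simulation. This is a sketch under stated assumptions, not the authors' procedure: a plain random search stands in for the GA, the estimand is a normal mean under squared loss, and no cost functions are modelled. The function names are hypothetical.

```python
import random
import statistics


def random_search_estimate(data, n_candidates, rng):
    """Stand-in for a GA: best of n_candidates random guesses under squared loss."""
    def loss(c):
        return sum((x - c) ** 2 for x in data)
    return min((rng.uniform(-3.0, 3.0) for _ in range(n_candidates)), key=loss)


def decompose_variability(n_datasets, n_runs, n_obs, n_candidates, rng):
    """Repeat the stochastic estimator n_runs times on each of n_datasets samples."""
    estimates = []
    for _ in range(n_datasets):
        data = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        estimates.append(
            [random_search_estimate(data, n_candidates, rng) for _ in range(n_runs)]
        )
    # Algorithmic part: average variance across runs on the same data set.
    algorithmic = statistics.fmean(statistics.pvariance(runs) for runs in estimates)
    # Sampling part: variance across data sets of the per-data-set mean estimate
    # (with finitely many runs this still contains a small algorithmic residue).
    sampling = statistics.pvariance([statistics.fmean(runs) for runs in estimates])
    total = statistics.pvariance([e for runs in estimates for e in runs])
    return algorithmic, sampling, total
```

With equal numbers of runs per data set, `total` equals `algorithmic + sampling` up to rounding. Increasing `n_candidates` (more runtime) shrinks the first term, while increasing `n_obs` (more data) shrinks the second, which is the tradeoff in question.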
Using neutral cline decay to estimate contemporary dispersal: a generic tool and its application to a major crop pathogen
Dispersal is a key parameter of adaptation, invasion and persistence. Yet standard population genetics inference methods can hardly distinguish it from drift, and many species cannot be studied by direct mark-recapture methods. Here, we introduce a method using rates of change in cline shapes for neutral markers to estimate contemporary dispersal. We apply it to the devastating banana pest Mycosphaerella fijiensis, a wind-dispersed fungus for which a secondary contact zone had previously been detected using landscape genetics tools. By tracking the spatio-temporal frequency change of 15 microsatellite markers, we find that σ, the standard deviation of parent–offspring dispersal distances, is 1.2 km/generation^(1/2). The analysis is further shown to be robust to a large range of dispersal kernels. We conclude that combining landscape genetics approaches to detect breaks in allelic frequencies with analyses of changes in neutral genetic clines offers a powerful way to obtain ecologically relevant estimates of dispersal in many species.
Bayesian computation via empirical likelihood
Approximate Bayesian computation (ABC) has become an essential tool for the
analysis of complex stochastic models when the likelihood function is
numerically unavailable. However, the well-established statistical method of
empirical likelihood provides another route to such settings that bypasses
simulations from the model and the choices of the ABC parameters (summary
statistics, distance, tolerance), while being convergent in the number of
observations. Furthermore, bypassing model simulations may lead to significant
time savings in complex models, for instance those found in population
genetics. The BCel algorithm we develop in this paper also provides an
evaluation of its own performance through an associated effective sample size.
The method is illustrated using several examples, including estimation of
standard distributions, time series, and population genetics models.
Comment: 21 pages, 12 figures, revised version of the previous version with a
new titl
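
One plausible reading of the approach in the abstract above is to draw parameters from the prior, weight each draw by its empirical likelihood, and report the effective sample size of the weights. The sketch below follows that reading for the simplest case, the mean of a sample, and is not the authors' implementation. The uniform prior on (-5, 5), the bisection solver for the Lagrange multiplier and the function names are assumptions made for illustration.

```python
import math
import random


def log_el_ratio(data, mu, n_bisect=60):
    """Log empirical likelihood ratio for the mean being equal to mu."""
    z = [x - mu for x in data]
    z_min, z_max = min(z), max(z)
    if z_min >= 0.0 or z_max <= 0.0:
        return -math.inf  # mu outside the data range: empirical likelihood is zero
    # The multiplier lam must keep every 1 + lam * z_i strictly positive.
    lo_bound, hi_bound = -1.0 / z_max, -1.0 / z_min
    pad = 1e-10 * (hi_bound - lo_bound)
    lo, hi = lo_bound + pad, hi_bound - pad
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        # g is strictly decreasing in lam; its root gives the optimal weights.
        g = sum(zi / (1.0 + mid * zi) for zi in z)
        if g > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return -sum(math.log1p(lam * zi) for zi in z)


def bcel_sample(data, n_draws, rng):
    """Prior draws weighted by empirical likelihood, with effective sample size."""
    mus = [rng.uniform(-5.0, 5.0) for _ in range(n_draws)]  # assumed uniform prior
    weights = [math.exp(log_el_ratio(data, m)) for m in mus]
    total = sum(weights)  # assumes at least one draw falls inside the data range
    post_mean = sum(w * m for w, m in zip(weights, mus)) / total
    ess = total ** 2 / sum(w * w for w in weights)
    return post_mean, ess
```

No data are simulated from the model at any point: the weights depend only on the observed sample, and a small effective sample size signals that few prior draws carry most of the weight.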