Phase randomisation: a convergence diagnostic test for MCMC
Most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. Potentially useful diagnostics may be borrowed from diverse areas such as time series analysis. One such method is phase randomisation. The aim of this paper is to describe this method in the context of MCMC, summarise its characteristics, and contrast its performance with that of the more common diagnostic tests for MCMC. It is observed that the new tool contributes information about third and higher order cumulant behaviour, which is important in characterising certain forms of nonlinearity and nonstationarity.
Keywords: convergence diagnostics; higher cumulants; Markov chain Monte Carlo; non-linear time series; stationarity; surrogate series
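A minimal sketch of the surrogate-series idea behind phase randomisation, assuming the MCMC output is a univariate chain stored in a NumPy array; the function names and the choice of a simple third-order test statistic are illustrative, not taken from the paper:

```python
import numpy as np

def phase_randomise(x, rng):
    """Return a surrogate series with the same power spectrum as x
    but with Fourier phases replaced by independent uniform draws."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spec = np.fft.rfft(x - x.mean())
    amp = np.abs(spec)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.size)
    phases[0] = 0.0                      # keep the DC component real
    if n % 2 == 0:
        phases[-1] = 0.0                 # keep the Nyquist component real
    surrogate = np.fft.irfft(amp * np.exp(1j * phases), n=n)
    return surrogate + x.mean()

def third_order_stat(x, lag=1):
    """A simple third-order cumulant-type statistic, mean of z_t * z_{t+lag}^2
    for the centred series z."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(z[:-lag] * z[lag:] ** 2)

def phase_randomisation_test(chain, n_surrogates=500, lag=1, seed=0):
    """Compare the chain's third-order statistic with its distribution under
    phase-randomised surrogates; a small p-value flags higher-cumulant
    structure that second-order diagnostics would miss."""
    rng = np.random.default_rng(seed)
    observed = third_order_stat(chain, lag)
    surrogate_stats = np.array([
        third_order_stat(phase_randomise(chain, rng), lag)
        for _ in range(n_surrogates)
    ])
    p_value = np.mean(np.abs(surrogate_stats) >= np.abs(observed))
    return observed, p_value
```

If the chain has reached a stationary, effectively linear regime, the observed statistic should look typical of the surrogate distribution; an extreme value signals the kind of nonlinearity or nonstationarity the abstract refers to.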
Model choice versus model criticism
The new perspectives on ABC and Bayesian model criticism presented in Ratmann et al. (2009) are challenging standard approaches to Bayesian model choice. We discuss here some issues arising from the authors' approach, including prior influence, model assessment and criticism, and the meaning of error in ABC.
Comment: This is a comment on the recent paper by Ratmann, Andrieu, Wiuf, and Richardson (PNAS, 106), submitted too late for PNAS to consider it
In praise of the referee
There has been a lively debate in many fields, including statistics and
related applied fields such as psychology and biomedical research, on possible
reforms of the scholarly publishing system. Currently, referees contribute so
much to improve scientific papers, both directly through constructive criticism
and indirectly through the threat of rejection. We discuss ways in which new
approaches to journal publication could continue to make use of the valuable
efforts of peer reviewers.
Comment: 13 pages
Modelling Survival Data to Account for Model Uncertainty: A Single Model or Model Averaging?
This study considered the problem of predicting survival, based on three alternative models: a single Weibull, a mixture of Weibulls and a cure model. Instead of the common procedure of choosing a single "best" model, where "best" is defined in terms of goodness of fit to the data, a Bayesian model averaging (BMA) approach was adopted to account for model uncertainty. This was illustrated using a case study in which the aim was the description of lymphoma cancer survival with covariates given by phenotypes and gene expression. The results of this study indicate that if the sample size is sufficiently large, one of the three models emerges as having the highest probability given the data, as indicated by the goodness-of-fit measure, the Bayesian information criterion (BIC). However, when the sample size was reduced, no single model was revealed as "best", suggesting that a BMA approach would be appropriate. Although a BMA approach can compromise on goodness of fit to the data (when compared to the true model), it can provide robust predictions and facilitate more detailed investigation of the relationships between gene expression and patient survival
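A hedged sketch of the BIC-based model-averaging step described above, assuming each candidate model (single Weibull, Weibull mixture, cure model) has already been fitted and its maximised log-likelihood, parameter count and survival predictions are available; all function names and the numbers in the usage line are illustrative:

```python
import numpy as np

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion for a fitted model."""
    return -2.0 * log_lik + n_params * np.log(n_obs)

def bma_weights(bic_values):
    """Approximate posterior model probabilities from BIC values
    (equal prior model probabilities assumed)."""
    bic_values = np.asarray(bic_values, dtype=float)
    delta = bic_values - bic_values.min()      # stabilise the exponentials
    w = np.exp(-0.5 * delta)
    return w / w.sum()

def bma_survival(per_model_survival, weights):
    """Model-averaged survival curve: a weighted mix of the per-model
    predictions S_k(t), one row per model, one column per time point."""
    per_model_survival = np.asarray(per_model_survival, dtype=float)
    return np.asarray(weights) @ per_model_survival

# Illustrative usage with made-up log-likelihoods, parameter counts and n = 120
weights = bma_weights([bic(-320.5, 2, 120), bic(-315.8, 5, 120), bic(-318.2, 3, 120)])
```

The weights approximate posterior model probabilities under equal prior weights: with a large sample one weight tends to dominate (a single "best" model), while with a small sample the weights spread out and the averaged prediction hedges across the candidate models.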
Bayesian computation via empirical likelihood
Approximate Bayesian computation (ABC) has become an essential tool for the
analysis of complex stochastic models when the likelihood function is
numerically unavailable. However, the well-established statistical method of
empirical likelihood provides another route to such settings that bypasses
simulations from the model and the choices of the ABC parameters (summary
statistics, distance, tolerance), while being convergent in the number of
observations. Furthermore, bypassing model simulations may lead to significant
time savings in complex models, for instance those found in population
genetics. The BCel algorithm we develop in this paper also provides an
evaluation of its own performance through an associated effective sample size.
The method is illustrated using several examples, including estimation of
standard distributions, time series, and population genetics models.
Comment: 21 pages, 12 figures, revised version of the previous version with a new title
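A stripped-down sketch of the general importance-sampling recipe behind this approach (not the authors' BCel implementation): parameters are drawn from the prior and each draw is weighted by the empirical likelihood of the data under a moment constraint. For brevity a simple mean constraint on i.i.d. data is assumed, and the effective sample size mentioned above is reported alongside the weights; in realistic applications the constraint would come from problem-specific estimating equations rather than a simple mean.

```python
import numpy as np

def log_empirical_likelihood_mean(data, mu, tol=1e-10, max_iter=200):
    """Log empirical likelihood (up to a constant) of the data under the
    moment constraint E[X] = mu, via the usual Lagrange-multiplier form."""
    d = np.asarray(data, dtype=float) - mu
    if d.min() >= 0 or d.max() <= 0:       # mu outside the data range: EL is zero
        return -np.inf
    lo = -1.0 / d.max() + 1e-12
    hi = -1.0 / d.min() - 1e-12
    # g(lam) = sum d_i / (1 + lam * d_i) is decreasing in lam; bisect for its root
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        if np.sum(d / (1.0 + lam * d)) > 0:
            lo = lam
        else:
            hi = lam
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return -np.sum(np.log1p(lam * d))      # equals log prod p_i plus n*log(n)

def bcel_mean(data, prior_draws):
    """Importance-sampling sketch: weight prior draws of the mean by their
    empirical likelihood; return normalised weights and an effective sample size."""
    log_w = np.array([log_empirical_likelihood_mean(data, mu) for mu in prior_draws])
    log_w -= log_w.max()
    w = np.exp(log_w)
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)
    return w, ess
```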
Proposer selection in EIP-7251
Immediate settlement, or single-slot finality (SSF), is a long-term goal for
Ethereum. The growing active validator set size is placing an increasing
computational burden on the network, making SSF more challenging. EIP-7251 aims
to reduce the number of validators by giving stakers the option to merge
existing validators. Key to the success of this proposal therefore is whether
stakers choose to merge their validators once EIP-7251 is implemented. It is
natural to assume stakers participate only if they anticipate greater expected
utility (risk-adjusted returns) as a single large validator. In this paper, we
focus on one of the duties that a validator performs, viz. being the proposer
for the next block. This duty can be quite lucrative, but happens infrequently.
Based on previous analysis, we may assume that EIP-7251 implies no change to
the security of the protocol. We confirm that the probability of a validator
being selected as block proposer is equivalent under each consolidation regime.
This result ensures that the decision of one staker to merge has no impact on
the opportunity of another to propose the next block, in turn ensuring there is
no major systemic change to the economics of the protocol with respect to
proposer selection.
Comment: 15 pages
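A toy simulation of the invariance claim, assuming a simplified selection rule in which the proposer is drawn with probability proportional to validator balance (a simplification of the consensus-layer rejection-sampling rule); the stakers, balances and function names below are illustrative only:

```python
import numpy as np

def proposer_probability(validators_by_staker, rng, n_slots=200_000):
    """Estimate each staker's probability of proposing the next block when
    the proposer is drawn with probability proportional to validator balance."""
    owners, balances = [], []
    for staker, balances_for_staker in validators_by_staker.items():
        for b in balances_for_staker:
            owners.append(staker)
            balances.append(b)
    owners = np.array(owners)
    p = np.array(balances, dtype=float)
    p /= p.sum()
    draws = rng.choice(len(p), size=n_slots, p=p)
    return {s: float(np.mean(owners[draws] == s)) for s in validators_by_staker}

rng = np.random.default_rng(1)
# Staker A keeps 64 validators of 32 ETH; staker B consolidates the same stake into one 2048 ETH validator.
before = proposer_probability({"A": [32] * 64, "B": [32] * 64}, rng)
after = proposer_probability({"A": [32] * 64, "B": [2048]}, rng)
```

Both estimates come out close to 0.5 for each staker, illustrating the claim that one staker's decision to consolidate leaves another staker's chance of proposing unchanged.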
Bayesian nonparametric dependent model for partially replicated data: the influence of fuel spills on species diversity
We introduce a dependent Bayesian nonparametric model for the probabilistic modeling of membership of subgroups in a community based on partially replicated data. The focus here is on species-by-site data, i.e. community data where observations at different sites are classified into distinct species. Our aim is to study the impact of additional covariates, for instance environmental variables, on the data structure, and in particular on the community diversity. To that purpose, we introduce dependence a priori across the covariates, and show that it improves posterior inference. We use a dependent version of the Griffiths-Engen-McCloskey distribution defined via the stick-breaking construction. This distribution is obtained by transforming a Gaussian process whose covariance function controls the desired dependence. The resulting posterior distribution is sampled by Markov chain Monte Carlo. We illustrate the application of our model to a soil microbial dataset acquired across a hydrocarbon contamination gradient at the site of a fuel spill in Antarctica. This method allows for inference on a number of quantities of interest in ecotoxicology, such as diversity or effective concentrations, and is broadly applicable to the general problem of community responses to environmental variables
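A minimal sketch of a covariate-dependent stick-breaking construction of the kind described above, assuming (purely for illustration) a squared-exponential Gaussian-process covariance and a probit transform of the GP draws into stick proportions; the paper's exact specification and its MCMC sampler are not reproduced here:

```python
import numpy as np
from scipy.stats import norm

def dependent_stick_breaking_weights(covariates, n_atoms=20, lengthscale=1.0, seed=0):
    """Draw covariate-dependent mixture weights: each stick proportion is a
    probit-transformed draw from a Gaussian process over the covariate, so
    weights at nearby covariate values (e.g. similar contamination levels)
    are similar."""
    rng = np.random.default_rng(seed)
    x = np.asarray(covariates, dtype=float).reshape(-1, 1)
    # Squared-exponential covariance controlling dependence across covariate values
    K = np.exp(-0.5 * ((x - x.T) / lengthscale) ** 2) + 1e-8 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    weights = np.zeros((len(x), n_atoms))
    remaining = np.ones(len(x))
    for j in range(n_atoms):
        z = L @ rng.standard_normal(len(x))   # one correlated GP draw per stick
        v = norm.cdf(z)                       # stick proportion in (0, 1)
        weights[:, j] = v * remaining
        remaining *= 1.0 - v
    # Truncation at n_atoms leaves a small residual mass in `remaining`
    return weights                            # rows: covariate values; columns: components
```

Because the sticks at nearby covariate values come from correlated GP draws, the induced weights, and hence diversity summaries computed from them, vary smoothly along the covariate (here, the contamination gradient).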
Developing the atlas of cancer in Queensland: methodological issues
Background: Achieving health equity has been identified as a major challenge, both internationally and within Australia. Inequalities in cancer outcomes are well documented, and must be quantified before they can be addressed. One method of portraying geographical variation in data uses maps. Recently we have produced thematic maps showing the geographical variation in cancer incidence and survival across Queensland, Australia. This article documents the decisions and rationale used in producing these maps, with the aim of assisting others in producing chronic disease atlases. Methods: Bayesian hierarchical models were used to produce the estimates. Justification is provided for the cancers chosen, the geographical areas used, the modelling method, the outcome measures mapped, the production of the adjacency matrix, the assessment of convergence, the sensitivity analyses performed and the determination of significant geographical variation. Conclusions: Although careful consideration of many issues is required, chronic disease atlases are a useful tool for assessing and quantifying geographical inequalities. In addition, they help focus research efforts to investigate why the observed inequalities exist, which in turn inform advocacy, policy, support and education programs designed to reduce these inequalities
A meta-analytic assessment of a Thyroglobulin marker for marbling in beef cattle
A meta-analysis was undertaken of the association between a polymorphism in the Thyroglobulin gene (TG5) and marbling in beef cattle. A Bayesian hierarchical model was adopted, with alternative representations assessed through sensitivity analysis. Based on the overall posterior means and posterior probabilities, there is substantial support for an additive association between the TG5 marker and marbling. The marker effect was also assessed across various breed groups, with each group displaying a high probability of a positive association between the T allele and marbling. The WinBUGS program code used to simulate the model is included as an Appendix available online at
Detecting Spatial Autocorrelation for a Small Number of Areas: a practical example
Moran's I is commonly used to detect spatial autocorrelation in spatial data. However, Moran's I may lead to underestimating spatial dependence when used for a small number of areas. This led to the development of Modified Moran's I, which is designed to work when there are few areas. Many R programs enable calculating Moran's I, but to date, none have been available for calculating Modified Moran's I. This paper aims to present both methods and provide the R code for calculating Modified Moran's I, with an application to a case study of dengue fever across 14 regions in Makassar, Indonesia
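A short sketch of the classical Moran's I computation, assuming area-level values and a binary contiguity weight matrix; the small-area Modified Moran's I is defined in the paper and its accompanying R code and is not reproduced here:

```python
import numpy as np

def morans_i(values, weights):
    """Classical Moran's I for areal data: `values` is a length-n vector of
    area-level observations and `weights` an n-by-n spatial weight matrix
    (e.g. binary contiguity)."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    n = x.size
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Toy usage: 4 areas in a row with binary contiguity weights (illustrative only)
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(morans_i([10.0, 12.0, 9.0, 3.0], w))
```

Values near zero suggest no spatial autocorrelation, while positive values indicate that neighbouring areas tend to take similar values.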
