Bayesian nonparametric analysis of reversible Markov chains
We introduce a three-parameter random walk with reinforcement, called the
$(\theta,\alpha,\beta)$ scheme, which generalizes the linearly edge reinforced
random walk to uncountable spaces. The parameter $\beta$ smoothly tunes the
$(\theta,\alpha,\beta)$ scheme between this edge reinforced random walk and the
classical exchangeable two-parameter Hoppe urn scheme, while the parameters
$\alpha$ and $\theta$ modulate how many states are typically visited. Resorting
to de Finetti's theorem for Markov chains, we use the $(\theta,\alpha,\beta)$
scheme to define a nonparametric prior for Bayesian analysis of reversible
Markov chains. The prior is applied in Bayesian nonparametric inference for
species sampling problems with data generated from a reversible Markov chain
with an unknown transition kernel. As a real example, we analyze data from
molecular dynamics simulations of protein folding.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1102 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
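The abstract above names the classical exchangeable two-parameter Hoppe urn scheme as one endpoint of the proposed random walk. A minimal sketch of that urn only (not the reinforced random walk from the paper) may help fix ideas. The function name, the parameter names `theta` and `alpha`, and the seed are illustrative assumptions; the update rule is the standard two-parameter urn, in which a new species appears with probability (theta + k*alpha)/(theta + n) after n draws showing k distinct species.

```python
import random


def two_parameter_hoppe_urn(n_draws, theta, alpha, seed=0):
    """Simulate species labels from a two-parameter Hoppe urn.

    Hypothetical helper for illustration; labels are 0, 1, 2, ... in order
    of first appearance.
    """
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = random.Random(seed)
    counts = []  # counts[j] = number of times species j has been drawn
    labels = []
    for n in range(n_draws):
        k = len(counts)
        # The first draw always creates a new species; afterwards a new
        # species appears with probability (theta + k*alpha) / (theta + n).
        if n == 0 or rng.random() < (theta + k * alpha) / (theta + n):
            counts.append(1)
            labels.append(k)
        else:
            # An existing species j is chosen with weight counts[j] - alpha.
            weights = [c - alpha for c in counts]
            j = rng.choices(range(k), weights=weights)[0]
            counts[j] += 1
            labels.append(j)
    return labels


if __name__ == "__main__":
    draws = two_parameter_hoppe_urn(1000, theta=1.0, alpha=0.5)
    print("distinct species after 1000 draws:", len(set(draws)))
```

Larger values of `theta` or `alpha` tend to produce more distinct species, which matches the abstract's remark that two of the parameters modulate how many states are typically visited.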
A comparison of Bayesian adaptive randomization and multi-stage designs for multi-arm clinical trials
Interpretable Model Summaries Using the Wasserstein Distance
Statistical models often include thousands of parameters. However, large
models decrease the investigator's ability to interpret and communicate the
estimated parameters. Reducing the dimensionality of the parameter space in the
estimation phase is a commonly used approach, but less work has focused on
selecting subsets of the parameters for interpreting the estimated model,
especially in settings such as Bayesian inference and model averaging.
Moreover, many models lack straightforward interpretations, which adds
another layer of obfuscation. To fill this gap, we introduce a new method that
uses the Wasserstein distance to identify a low-dimensional interpretable model
projection. After the estimation of complex models, users can budget how many
parameters they wish to interpret, and the proposed method generates a simplified model
of the desired dimension minimizing the distance to the full model. We provide
simulation results to illustrate the method and apply it to cancer datasets.
False discovery rates in somatic mutation studies of cancer
The purpose of cancer genome sequencing studies is to determine the nature
and types of alterations present in a typical cancer and to discover genes
mutated at high frequencies. In this article we discuss statistical methods for
the analysis of somatic mutation frequency data generated in these studies. We
place special emphasis on a two-stage study design introduced by Sjöblom et
al. [Science 314 (2006) 268-274]. In this context, we describe and compare
statistical methods for constructing scores that can be used to prioritize
candidate genes for further investigation and to assess the statistical
significance of the candidates thus identified. Controversy has surrounded the
reliability of the false discovery rate estimates provided by the
approximations used in early cancer genome studies. To address this, we
develop a semiparametric Bayesian model that provides an accurate fit to the
data. We use this model to generate a large collection of realistic scenarios,
and evaluate alternative approaches on this collection. Our assessment is
impartial in that the model used for generating data is not used by any of the
approaches compared. It is also objective, in that the scenarios are generated by a
model that fits the data. Our results quantify the conservative control of the
false discovery rate with the Benjamini and Hochberg method compared to the
empirical Bayes approach and the multiple testing method proposed in Storey [J.
R. Stat. Soc. Ser. B Stat. Methodol. 64 (2002) 479-498]. Simulation results
also show a negligible departure from the target false discovery rate for the
methodology used in Sjöblom et al. [Science 314 (2006) 268-274].
Comment: Published at http://dx.doi.org/10.1214/10-AOAS438 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
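The abstract above compares false discovery rate control under the Benjamini and Hochberg method with other approaches. As a reminder of what that method does, here is a minimal sketch of the standard step-up procedure. The function name, the level `q`, and the p-values in the usage example are illustrative assumptions and are not taken from the paper or its data.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Standard step-up rule: sort the m p-values, find the largest rank k
    with p_(k) <= q * k / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the hypotheses ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k_max = rank  # keep the largest rank that passes
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values for illustration only.
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, q=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold is still rejected when a larger p-value passes a later one; with p-values 0.03 and 0.04 at q = 0.05, both hypotheses are rejected.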