Bernstein - von Mises theorem and misspecified models: a review
This is a review of the asymptotic and non-asymptotic behaviour of Bayesian
methods under model misspecification. In particular, we focus on consistency, i.e.
convergence of the posterior distribution to the point mass at the best
parametric approximation to the true model, and on conditions for it to be locally
Gaussian around this point. For well-specified regular models, the variance of the
Gaussian approximation coincides with the inverse Fisher information, making Bayesian
inference asymptotically efficient. In this review, we discuss how this is
affected by model misspecification. We also discuss approaches to adjusting
Bayesian inference to make it asymptotically efficient under model
misspecification.
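The following display sketches the usual form of the result described in this abstract. The notation is an assumption introduced here and is not taken from the review: \theta^* denotes the best parametric approximation (the Kullback-Leibler minimiser), \hat\theta_n the maximum likelihood estimator, V_{\theta^*} the negative expected Hessian of the log-likelihood at \theta^*, and M_{\theta^*} the covariance of the score at \theta^*, both taken under the true distribution.

```latex
\[
  \Pi\bigl(\theta \in \cdot \mid X_1,\dots,X_n\bigr)
  \approx N\bigl(\hat\theta_n,\; n^{-1} V_{\theta^*}^{-1}\bigr),
  \qquad
  \sqrt{n}\,\bigl(\hat\theta_n - \theta^*\bigr)
  \stackrel{d}{\longrightarrow}
  N\bigl(0,\; V_{\theta^*}^{-1} M_{\theta^*} V_{\theta^*}^{-1}\bigr).
\]
```

When the model is well specified, V_{\theta^*} and M_{\theta^*} both equal the Fisher information and the two variances agree. Under misspecification they generally differ, so the posterior spread need not match the sampling variability of the estimator.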
Bayesian Inverse Problems with Heterogeneous Variance
We consider inverse problems in Hilbert spaces contaminated by Gaussian
noise, and use a Bayesian approach to find their regularised smooth solutions. We
consider the so-called conjugate diagonal setting, where the covariance
operators of the noise and of the prior are diagonalisable in the orthogonal
bases associated with the forward operator of the inverse problem. Firstly, we
derive the minimax rate of convergence in such problems with a known covariance
operator of the noise, showing that in the case of heterogeneous variance the
ill-posed inverse problem can become self-regularised in some cases when the
eigenvalues of the variance operator decay to zero, achieving the parametric rate
of convergence; as far as we are aware, this is a striking novel result that
has not been observed before in nonparametric problems. Secondly, we give a
general expression for the rate of contraction of the posterior distribution,
for a given prior distribution, when the noise covariance operator is known and
the noise level is small. We also investigate when this contraction rate
coincides with the optimal rate in the minimax sense, which is typically used as
a benchmark for studying posterior contraction rates. We apply our results
to known variance operators with polynomially decreasing or increasing
eigenvalues as an example. We also discuss when the plug-in estimator of the
eigenvalues of the covariance operator of the noise does not affect the rate of
contraction of the posterior distribution of the signal. We show that
plugging in the maximum marginal likelihood estimator of the prior scaling
parameter leads to the optimal posterior contraction rate, adaptively. The effect
of the choice of the prior parameters on the contraction in such models is
illustrated on simulated data with the Volterra operator.
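To make the conjugate diagonal setting concrete, the display below sketches one common sequence-space formulation. All symbols are assumptions introduced for illustration and may differ from the paper's notation: a_k are the singular values of the forward operator, \sigma_k^2 the eigenvalues of the noise covariance operator, \varepsilon the noise level, \lambda_k the eigenvalues of the prior covariance operator, and \tau^2 the prior scaling parameter.

```latex
\[
  Y_k = a_k \mu_k + \sqrt{\varepsilon}\,\sigma_k \xi_k,
  \quad \xi_k \stackrel{\text{iid}}{\sim} N(0,1),
  \qquad
  \mu_k \sim N(0, \tau^2 \lambda_k) \ \text{independently},
\]
\[
  \mu_k \mid Y_k \sim
  N\!\left(
    \frac{\tau^2 \lambda_k a_k\, Y_k}{a_k^2 \tau^2 \lambda_k + \varepsilon \sigma_k^2},\;
    \frac{\tau^2 \lambda_k\, \varepsilon \sigma_k^2}{a_k^2 \tau^2 \lambda_k + \varepsilon \sigma_k^2}
  \right).
\]
```

Under these assumptions the posterior factorises over coordinates, and each coordinate is the standard normal-normal conjugate update.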
Shared Differential Clustering across Single-cell RNA Sequencing Datasets with the Hierarchical Dirichlet Process
Single-cell RNA sequencing (scRNA-seq) is a powerful technology that allows researchers to understand gene expression patterns at the single-cell level. However, analysing scRNA-seq data is challenging due to issues and biases in data collection. In this work, we construct an integrated Bayesian model that simultaneously addresses normalization, imputation and batch effects, and also nonparametrically clusters cells into groups across multiple datasets. A Gibbs sampler based on a finite-dimensional approximation of the hierarchical Dirichlet process (HDP) is developed for posterior inference.
Testing for equal correlation matrices with application to paired gene expression data
We present a novel method for testing the hypothesis of equality of two
correlation matrices using paired high-dimensional datasets. We consider test
statistics based on the average of squares, maximum and sum of exceedances of
Fisher-transformed sample correlations, and we derive approximate null
distributions using asymptotic and non-parametric distributions. Theoretical
results on the power of the tests are presented and backed up by a range of
simulation experiments. We apply the methodology to a case study of colorectal
tumour gene expression data with the aim of discovering biological pathway
lists of genes that present significantly different correlation matrices on
healthy and tumour samples. We find strong evidence that the correlation
matrices of a large part of the pathway lists change between the two medical
conditions. Comment: 31 pages, 3 figures
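As a rough illustration of the three kinds of test statistic named in this abstract, the sketch below computes them from the differences of Fisher-transformed sample correlations. It is not the authors' procedure: the function names, the `threshold` parameter, the reading of "sum of exceedances" as a sum of squared differences above a threshold, and the scaling by the independent-sample standard error (which ignores the dependence induced by pairing) are all assumptions, and no null distribution is computed.

```python
import numpy as np

def fisher_z(x):
    """Fisher transform of the off-diagonal sample correlations of x (n samples by p variables)."""
    r = np.corrcoef(x, rowvar=False)
    upper = np.triu_indices_from(r, k=1)
    # Clip so that correlations numerically equal to +/-1 do not give infinities.
    return np.arctanh(np.clip(r[upper], -0.999999, 0.999999))

def correlation_difference_statistics(x, y, threshold=2.0):
    """Hypothetical helper: summary statistics of scaled Fisher-z differences."""
    n_x, n_y = x.shape[0], y.shape[0]
    # Assumption: independent-sample standard error of a Fisher-z difference.
    scale = np.sqrt(1.0 / (n_x - 3) + 1.0 / (n_y - 3))
    d = (fisher_z(x) - fisher_z(y)) / scale
    d2 = d ** 2
    return {
        "mean_square": float(np.mean(d2)),           # average of squares
        "maximum": float(np.max(np.abs(d))),         # largest absolute difference
        "exceedance_sum": float(np.sum(d2[np.abs(d) > threshold])),  # assumed definition
    }
```

In practice the statistics would be compared with a reference distribution; the abstract mentions both asymptotic and non-parametric approximations, which are not reproduced here.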
Use of didactic terminology by teachers at various stages of professional communication
The article is based on a study that involved 115 students acquiring professional pedagogical education and 115 practicing teachers. The article describes the process of emergence of individual conceptual and terminological frameworks during various stages of professional communication, i.e. training and pedagogical activity. Individual frameworks of concepts have been studied through comparison of respondents' interpretations of definitions of basic didactic concepts (a total of 3487 definitions have been processed), 353 concept maps, and wordings of professionally significant problems in which didactic terms were used (a total of 400 statements have been analyzed). Factors that influence how teachers use didactic terms in various instances of professional communication have been described.
Adaptive density estimation based on a mixture of Gammas
We consider the problem of Bayesian density estimation on the positive
semiline for possibly unbounded densities. We propose a hierarchical Bayesian
estimator based on the gamma mixture prior, which can be viewed as a location
mixture. We study convergence rates of Bayesian density estimators based on
such mixtures. We construct approximations of the local Hölder densities, and
of their extension to unbounded densities, by continuous mixtures of gamma
distributions, leading to approximations of such densities by finite mixtures.
These results are then used to derive posterior concentration rates, with
priors based on these mixture models. The rates are minimax (up to a log n
term) and, since the priors are independent of the smoothness, the rates are
adaptive to the smoothness.
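The idea of a gamma mixture viewed as a location mixture can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: each component is a gamma density parameterised by its mean and a shared shape, so that the mean plays the role of a location, and the function names are hypothetical.

```python
import math

def gamma_pdf_mean_shape(x, mean, shape):
    """Gamma density with the given mean and shape, i.e. rate = shape / mean."""
    if x <= 0:
        return 0.0
    rate = shape / mean
    # Work on the log scale for numerical stability with large shapes.
    log_pdf = (shape * math.log(rate) + (shape - 1.0) * math.log(x)
               - rate * x - math.lgamma(shape))
    return math.exp(log_pdf)

def gamma_mixture_pdf(x, weights, means, shape):
    """Finite mixture of gamma densities with a common shape and component means."""
    return sum(w * gamma_pdf_mean_shape(x, m, shape)
               for w, m in zip(weights, means))
```

With a common shape, a larger shape value makes each component more concentrated around its mean, which is the sense in which the mixture behaves like a location mixture in this sketch.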
The Bernstein-von Mises theorem and non-regular models
We study the asymptotic behaviour of the posterior distribution in a broad
class of statistical models where the "true" solution occurs on the boundary of
the parameter space. We show that in this case Bayesian inference is
consistent, and that the posterior distribution has not only Gaussian
components as in the case of regular models (the Bernstein-von Mises theorem)
but also Gamma distribution components, whose form depends on the behaviour
of the prior distribution near the boundary and which have a faster rate of
convergence. We also demonstrate a remarkable property of Bayesian inference,
that for some models, there appears to be no bound on efficiency of estimating
the unknown parameter if it is on the boundary of the parameter space. We
illustrate the results on a problem from emission tomography. Comment: Published
at http://dx.doi.org/10.1214/14-AOS1239 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)