8,614 research outputs found
Bayesian comparison of latent variable models: Conditional vs marginal likelihoods
Typical Bayesian methods for models with latent variables (or random effects)
involve directly sampling the latent variables along with the model parameters.
In high-level software code for model definitions (using, e.g., BUGS, JAGS,
Stan), the likelihood is therefore specified as conditional on the latent
variables. This can lead researchers to perform model comparisons via
conditional likelihoods, where the latent variables are considered model
parameters. In other settings, however, typical model comparisons involve
marginal likelihoods where the latent variables are integrated out. This
distinction is often overlooked despite the fact that it can have a large
impact on the comparisons of interest. In this paper, we clarify and illustrate
these issues, focusing on the comparison of conditional and marginal Deviance
Information Criteria (DICs) and Watanabe-Akaike Information Criteria (WAICs) in
psychometric modeling. The conditional/marginal distinction corresponds to
whether the model should be predictive for the clusters that are in the data or
for new clusters (where "clusters" typically correspond to higher-level units
like people or schools). Correspondingly, we show that marginal WAIC
corresponds to leave-one-cluster out (LOcO) cross-validation, whereas
conditional WAIC corresponds to leave-one-unit out (LOuO). These results lead
to recommendations on the general application of the criteria to models with
latent variables.Comment: Manuscript in press at Psychometrika; 31 pages, 8 figure
Accelerating delayed-acceptance Markov chain Monte Carlo algorithms
Delayed-acceptance Markov chain Monte Carlo (DA-MCMC) samples from a
probability distribution via a two-stages version of the Metropolis-Hastings
algorithm, by combining the target distribution with a "surrogate" (i.e. an
approximate and computationally cheaper version) of said distribution. DA-MCMC
accelerates MCMC sampling in complex applications, while still targeting the
exact distribution. We design a computationally faster, albeit approximate,
DA-MCMC algorithm. We consider parameter inference in a Bayesian setting where
a surrogate likelihood function is introduced in the delayed-acceptance scheme.
When the evaluation of the likelihood function is computationally intensive,
our scheme produces a 2-4 times speed-up, compared to standard DA-MCMC.
However, the acceleration is highly problem dependent. Inference results for
the standard delayed-acceptance algorithm and our approximated version are
similar, indicating that our algorithm can return reliable Bayesian inference.
As a computationally intensive case study, we introduce a novel stochastic
differential equation model for protein folding data.Comment: 40 pages, 21 figures, 10 table
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for {exploratory
data analysis} are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19
Bayesian inference for stochastic differential equation mixed effects models of a tumor xenography study
We consider Bayesian inference for stochastic differential equation mixed
effects models (SDEMEMs) exemplifying tumor response to treatment and regrowth
in mice. We produce an extensive study on how a SDEMEM can be fitted using both
exact inference based on pseudo-marginal MCMC and approximate inference via
Bayesian synthetic likelihoods (BSL). We investigate a two-compartments SDEMEM,
these corresponding to the fractions of tumor cells killed by and survived to a
treatment, respectively. Case study data considers a tumor xenography study
with two treatment groups and one control, each containing 5-8 mice. Results
from the case study and from simulations indicate that the SDEMEM is able to
reproduce the observed growth patterns and that BSL is a robust tool for
inference in SDEMEMs. Finally, we compare the fit of the SDEMEM to a similar
ordinary differential equation model. Due to small sample sizes, strong prior
information is needed to identify all model parameters in the SDEMEM and it
cannot be determined which of the two models is the better in terms of
predicting tumor growth curves. In a simulation study we find that with a
sample of 17 mice per group BSL is able to identify all model parameters and
distinguish treatment groups.Comment: Minor revision: posterior predictive checks for BSL have ben updated
(both theory and results). Code on GitHub has ben revised accordingl
- …