Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data
Recently, new methods for model assessment, based on subsampling and
posterior approximations, have been proposed for scaling leave-one-out
cross-validation (LOO) to large datasets. Although these methods work well for
estimating predictive performance for individual models, they are less powerful
in model comparison. We propose an efficient method for estimating differences
in predictive performance by combining fast approximate LOO surrogates with
exact LOO subsampling using the difference estimator, and we supply proofs of
its scaling characteristics. The resulting approach can be orders of magnitude
more efficient than previous approaches, as well as being better suited to
model comparison.
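The difference-estimator idea in this abstract can be sketched numerically: a cheap surrogate is available for every observation, exact values are computed on a small subsample, and the subsample only has to correct the surrogate's bias. The data and the uniform-subsampling design below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def difference_estimator(surrogate, exact_fn, m, rng):
    """Estimate sum_i f(i) from a cheap surrogate a_i (known for all n points)
    plus exact evaluations f(i) on only m subsampled points:

        sum_i a_i + (n / m) * sum_{i in S} (f(i) - a_i),

    with S drawn here by simple random sampling with replacement."""
    n = len(surrogate)
    idx = rng.integers(0, n, size=m)
    corrections = np.array([exact_fn(i) - surrogate[i] for i in idx])
    return surrogate.sum() + n * corrections.mean()

# Synthetic per-observation predictive differences between two models: the
# surrogate tracks the exact values closely, so few exact evaluations suffice.
n = 100_000
exact = rng.normal(-1.0, 0.5, size=n)            # stand-in for exact LOO differences
surrogate = exact + rng.normal(0, 0.02, size=n)  # fast approximate LOO surrogate
est = difference_estimator(surrogate, lambda i: exact[i], m=200, rng=rng)
```

Because the estimator's variance depends only on how well the surrogate tracks the exact values, 200 exact LOO evaluations can stand in for 100,000 here.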
Bayesian comparison of latent variable models: Conditional vs marginal likelihoods
Typical Bayesian methods for models with latent variables (or random effects)
involve directly sampling the latent variables along with the model parameters.
In high-level software code for model definitions (using, e.g., BUGS, JAGS,
Stan), the likelihood is therefore specified as conditional on the latent
variables. This can lead researchers to perform model comparisons via
conditional likelihoods, where the latent variables are considered model
parameters. In other settings, however, typical model comparisons involve
marginal likelihoods where the latent variables are integrated out. This
distinction is often overlooked despite the fact that it can have a large
impact on the comparisons of interest. In this paper, we clarify and illustrate
these issues, focusing on the comparison of conditional and marginal Deviance
Information Criteria (DICs) and Watanabe-Akaike Information Criteria (WAICs) in
psychometric modeling. The conditional/marginal distinction corresponds to
whether the model should be predictive for the clusters that are in the data or
for new clusters (where "clusters" typically correspond to higher-level units
like people or schools). Correspondingly, we show that marginal WAIC
corresponds to leave-one-cluster out (LOcO) cross-validation, whereas
conditional WAIC corresponds to leave-one-unit out (LOuO). These results lead
to recommendations on the general application of the criteria to models with
latent variables.
Comment: Manuscript in press at Psychometrika; 31 pages, 8 figures.
Bayesian leave-one-out cross-validation for large data
Model inference, such as model comparison, model checking, and model
selection, is an important part of model development. Leave-one-out
cross-validation (LOO) is a general approach for assessing the generalizability
of a model, but unfortunately, LOO does not scale well to large datasets. We
propose a combination of using approximate inference techniques and
probability-proportional-to-size-sampling (PPS) for fast LOO model evaluation
for large datasets. We provide both theoretical and empirical results showing
good properties for large data.
Comment: Accepted to ICML 2019. This version is the submitted paper.
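The PPS component of this abstract can be illustrated with a Hansen-Hurwitz estimator: points are drawn with probability proportional to a size measure (such as the magnitude of an approximate pointwise elpd), and exact values are reweighted by their inclusion probabilities. The size measure and data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def pps_estimate(size, exact_fn, m, rng):
    """Hansen-Hurwitz estimator of sum_i f(i) under sampling with
    probability proportional to size (PPS):

        p_i = size_i / sum_j size_j,  estimate = (1/m) * sum_draws f(i) / p_i
    """
    p = size / size.sum()
    idx = rng.choice(len(size), size=m, p=p)
    return np.mean([exact_fn(i) / p[i] for i in idx])

# If the size measure is exactly proportional to |f(i)| and all f(i) share a
# sign, each term f(i)/p_i is constant, so the estimator has zero variance.
exact = -np.abs(rng.normal(1.0, 0.2, size=5000))  # stand-in for exact pointwise elpd
size = -exact                                     # perfectly informative size measure
est = pps_estimate(size, lambda i: exact[i], m=100, rng=rng)
```

In practice the size measure comes from a cheap approximation rather than the exact values, and the estimator's variance grows with the mismatch between the two.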
A tutorial on group effective connectivity analysis, part 2: second level analysis with PEB
This tutorial provides a worked example of using Dynamic Causal Modelling
(DCM) and Parametric Empirical Bayes (PEB) to characterise inter-subject
variability in neural circuitry (effective connectivity). This involves
specifying a hierarchical model with two or more levels. At the first level,
state space models (DCMs) are used to infer the effective connectivity that
best explains a subject's neuroimaging timeseries (e.g. fMRI, MEG, EEG).
Subject-specific connectivity parameters are then taken to the group level,
where they are modelled using a General Linear Model (GLM) that partitions
between-subject variability into designed effects and additive random effects.
The ensuing (Bayesian) hierarchical model conveys both the estimated connection
strengths and their uncertainty (i.e., posterior covariance) from the subject
to the group level; enabling hypotheses to be tested about the commonalities
and differences across subjects. This approach can also finesse parameter
estimation at the subject level, by using the group-level parameters as
empirical priors. We walk through this approach in detail, using data from a
published fMRI experiment that characterised individual differences in
hemispheric lateralization in a semantic processing task. The preliminary
subject specific DCM analysis is covered in detail in a companion paper. This
tutorial is accompanied by the example dataset and step-by-step instructions to
reproduce the analyses.
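The second-level GLM described in this abstract partitions between-subject variability in connectivity parameters into designed effects and residual random effects. The toy below illustrates only that partition with ordinary least squares on made-up numbers; PEB itself is Bayesian, carrying each subject's posterior covariance to the group level and using empirical priors, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: one estimated connectivity parameter per subject for
# 20 subjects, and a second-level design matrix X whose columns are the
# group mean and a between-subject covariate (e.g. a laterality score).
n_sub = 20
covariate = rng.normal(size=n_sub)
X = np.column_stack([np.ones(n_sub), covariate])
beta_true = np.array([0.5, 0.3])                   # assumed group-level effects
theta = X @ beta_true + rng.normal(0, 0.1, n_sub)  # subject-level estimates

# Partition between-subject variability into designed effects (X @ beta)
# and residual random effects, here by ordinary least squares.
beta_hat, *_ = np.linalg.lstsq(X, theta, rcond=None)
residual_var = np.var(theta - X @ beta_hat, ddof=X.shape[1])
```

The recovered `beta_hat` plays the role of the group-level effects of interest; in PEB, uncertainty in each subject's `theta` would additionally inform how strongly each subject weighs on the group estimate.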