Controversy in statistical analysis of functional magnetic resonance imaging data
To test the validity of statistical methods for fMRI data analysis, Eklund et al. (1) used, for the first time, large-scale experimental data rather than simulated data. Using resting-state fMRI measurements to represent a null hypothesis of no task-induced activation, the authors compared familywise error rates for voxel-based and cluster-based inferences, for both parametric and nonparametric methods, across three fMRI statistical analysis packages. They found that, for a target familywise error rate of 5%, the parametric methods gave invalid cluster-based inferences and conservative voxel-based inferences.
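The validation strategy described above can be sketched in a few lines: run many group analyses on null data, apply a multiple-comparisons correction, and count how often at least one voxel is (falsely) declared significant. The sketch below uses synthetic Gaussian noise in place of resting-state fMRI and Bonferroni-corrected voxel-wise t-tests; all sizes (`n_analyses`, `n_subjects`, `n_voxels`) are illustrative, not those of the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_analyses, n_subjects, n_voxels = 500, 20, 1000
alpha = 0.05

false_positive_count = 0
for _ in range(n_analyses):
    # Null data: no true task effect at any voxel.
    data = rng.standard_normal((n_subjects, n_voxels))
    # One-sample t-test at every voxel, Bonferroni-corrected threshold.
    t, p = stats.ttest_1samp(data, 0.0, axis=0)
    if (p < alpha / n_voxels).any():  # any voxel significant after correction
        false_positive_count += 1

# Fraction of null analyses with at least one significant voxel;
# a valid procedure should keep this at or below alpha.
empirical_fwer = false_positive_count / n_analyses
```

Replacing the synthetic noise with real resting-state data is exactly what makes the original study's estimates empirical rather than model-based.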
Can parametric statistical methods be trusted for fMRI based group studies?
The most widely used task fMRI analyses rely on parametric methods that depend on a variety of assumptions. While individual aspects of these fMRI models have been evaluated, they have not been evaluated comprehensively with empirical data. In this work, a total of 2 million random task fMRI group analyses were performed using resting-state fMRI data to compute empirical familywise error rates for the software packages SPM, FSL, and AFNI, as well as for a standard non-parametric permutation method. While there is some variation, for a nominal familywise error rate of 5% the parametric statistical methods are shown to be conservative for voxel-wise inference and invalid for cluster-wise inference; in particular, cluster size inference with a cluster defining threshold of p = 0.01 generates familywise error rates of up to 60%. A number of follow-up analyses and investigations suggest that the cause of the invalid cluster inferences is spatial autocorrelation functions that do not follow the assumed Gaussian shape. By comparison, the non-parametric permutation test, which rests on a small number of assumptions, is found to produce valid results for both voxel-wise and cluster-wise inference. Using real task data, we compare the results of one parametric method and the permutation test, and find stark differences in the conclusions drawn under cluster inference. These findings speak to the need to validate the statistical methods used in the neuroimaging field.
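The non-parametric permutation test favoured above controls the familywise error rate by building a null distribution of the *maximum* statistic across voxels, assuming only exchangeability under the null. A minimal two-group sketch on synthetic data (group sizes, voxel count, and permutation count are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_a, n_b, n_voxels, n_perms = 15, 15, 200, 1000

# Two groups of per-subject maps; here pure noise, i.e. the null is true.
group_a = rng.standard_normal((n_a, n_voxels))
group_b = rng.standard_normal((n_b, n_voxels))
data = np.vstack([group_a, group_b])

def mean_diff(d):
    """Group mean difference at every voxel."""
    return d[:n_a].mean(axis=0) - d[n_a:].mean(axis=0)

observed = mean_diff(data)

# Null distribution of the maximum absolute statistic across voxels:
# permuting group labels is valid under the null of exchangeability.
max_null = np.empty(n_perms)
for i in range(n_perms):
    perm = rng.permutation(data.shape[0])
    max_null[i] = np.abs(mean_diff(data[perm])).max()

# FWER-corrected threshold: 95th percentile of the max-statistic null.
threshold = np.quantile(max_null, 0.95)
significant = np.abs(observed) > threshold
```

The same max-statistic idea extends to cluster-wise inference by recording the largest cluster size per permutation instead of the largest voxel statistic.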
Cluster Failure Revisited: Impact of First Level Design and Data Quality on Cluster False Positive Rates
Methodological research rarely generates broad interest, yet our work on the validity of cluster inference methods for functional magnetic resonance imaging (fMRI) created intense discussion of both the minutiae of our approach and its implications for the discipline. In the present work, we take on various critiques of that work and further explore its limitations. We address issues with the particular event-related designs we used, considering multiple event types and randomisation of events between subjects. We consider the lack of validity found with one-sample permutation (sign-flipping) tests, investigating a number of approaches to improve the false positive control of this widely used procedure. We found that the combination of a two-sided test and cleaning the data using ICA FIX resulted in nominal false positive rates for all datasets, meaning that data cleaning is important not only for resting-state fMRI but also for task fMRI. Finally, we discuss the implications of our work for the fMRI literature as a whole, estimating that at least 10% of fMRI studies have used the most problematic cluster inference method (p = 0.01 cluster defining threshold), and how individual studies can be interpreted in light of our findings. These additional results underscore our original conclusions on the importance of data sharing and of thorough evaluation of statistical methods on realistic null data.
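The one-sample sign-flipping test discussed above assumes that subject effects are symmetrically distributed around zero under the null; randomly flipping each subject's sign then generates the null distribution. A minimal single-voxel sketch on synthetic data, including the two-sided p-value variant the abstract finds better behaved (all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_flips = 20, 2000

# Per-subject contrast estimates at a single voxel (null: mean zero).
effects = rng.standard_normal(n_subjects)
observed = effects.mean()

# Sign flipping: under a symmetric null, each subject's sign is exchangeable.
null_means = np.empty(n_flips)
for i in range(n_flips):
    signs = rng.choice([-1.0, 1.0], size=n_subjects)
    null_means[i] = (signs * effects).mean()

# Two-sided p-value: compare |observed| against the flipped null.
p_two_sided = (np.abs(null_means) >= np.abs(observed)).mean()
```

In a real analysis the same flips would be applied to whole subject maps at once, preserving the spatial structure of each subject's data.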
Evaluation of second-level inference in fMRI analysis
We investigate the impact of decisions in the second-level (i.e., over-subjects) inferential process in functional magnetic resonance imaging on (1) the balance between false positives and false negatives and (2) data-analytical stability, both proxies for the reproducibility of results. Second-level analysis based on a mass-univariate approach typically consists of three phases. First, one fits a general linear model to a test image that pools information from different subjects; we evaluate models that take first-level (within-subject) variability into account and models that do not. Second, one performs inference based on parametric assumptions or via permutation-based inference. Third, we evaluate three commonly used procedures to address the multiple testing problem: familywise error rate correction, False Discovery Rate (FDR) correction, and a two-step procedure with a minimal cluster size. Based on a simulation study and real data, we find that the two-step procedure with a minimal cluster size yields the most stable results, followed by familywise error rate correction. FDR correction yields the most variable results, for both permutation-based and parametric inference. Modelling the subject-specific variability yields a better balance between false positives and false negatives when using parametric inference.
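The three multiple-testing procedures compared above can be sketched side by side on a vector of voxel p-values. This is a toy 1-D version (a real image is 3-D and clusters are found by connected components); the threshold and minimum cluster size values are illustrative:

```python
import numpy as np

def fwer_bonferroni(p, alpha=0.05):
    """Familywise error rate control: reject where p < alpha / m."""
    return p < alpha / p.size

def fdr_bh(p, q=0.05):
    """Benjamini-Hochberg FDR: reject the k smallest p-values, where k is
    the largest index with p_(k) <= k * q / m."""
    order = np.argsort(p)
    ranked = p[order]
    m = p.size
    below = ranked <= (np.arange(1, m + 1) / m) * q
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

def two_step_cluster(p, p_thresh=0.01, min_cluster=3):
    """Two-step procedure along a 1-D 'image': threshold voxels first,
    then keep only runs of at least min_cluster contiguous voxels."""
    supra = p < p_thresh
    reject = np.zeros_like(supra)
    i = 0
    while i < supra.size:
        if supra[i]:
            j = i
            while j < supra.size and supra[j]:
                j += 1
            if j - i >= min_cluster:
                reject[i:j] = True
            i = j
        else:
            i += 1
    return reject
```

The procedures differ in what they guarantee: Bonferroni bounds the probability of any false positive, BH bounds the expected fraction of false positives among rejections, and the two-step rule trades voxel-level control for sensitivity to contiguous activation.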
Bayesian multi-modal model comparison: a case study on the generators of the spike and the wave in generalized spike–wave complexes
We present a novel approach to assessing the networks involved in the generation of spontaneous pathological brain activity, based on multi-modal imaging data. We propose to use probabilistic fMRI-constrained EEG source reconstruction as a complement to EEG-correlated fMRI analysis to disambiguate between networks that co-occur at the fMRI time resolution. The method is based on Bayesian model comparison, where the different models correspond to different combinations of fMRI-activated (or deactivated) cortical clusters. By computing the model evidence (or marginal likelihood) of every candidate source space partition, we can infer the most probable set of fMRI regions that generated a given EEG scalp data window. We illustrate the method using EEG-correlated fMRI data acquired in a patient with ictal generalized spike–wave (GSW) discharges, to examine whether different networks are involved in the generation of the spike and the wave components, respectively. To this effect, we compared a family of 128 EEG source models based on combinations of seven regions haemodynamically involved (deactivated) during a prolonged ictal GSW discharge, namely: bilateral precuneus, bilateral medial frontal gyrus, bilateral middle temporal gyrus, and right cuneus. Bayesian model comparison revealed that the most likely model associated with the spike component consists of a prefrontal region and bilateral temporal–parietal regions, and that the most likely model associated with the wave component comprises the same temporal–parietal regions only. This result supports the hypothesis of different neurophysiological mechanisms underlying the generation of the spike versus the wave components of GSW discharges.
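The exhaustive comparison over combinations of candidate regions can be sketched with a toy linear model. Everything below is illustrative: the actual method computes proper model evidence for EEG source models, whereas this sketch scores every subset of three made-up "region" regressors with BIC as a crude stand-in for -2 times the log evidence.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_regions = 200, 3  # 3 candidate regions -> 2**3 subsets

# Toy "source" regressors; the data are generated by regions 0 and 2 only.
X = rng.standard_normal((n_samples, n_regions))
y = X[:, 0] * 2.0 + X[:, 2] * 1.5 + 0.5 * rng.standard_normal(n_samples)

def bic(y, Xs):
    """BIC of an ordinary least-squares fit; lower is better.
    Used here as a rough proxy for -2 * log model evidence."""
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    n, k = Xs.shape
    return n * np.log(float(resid @ resid) / n) + k * np.log(n)

# Score every non-empty subset of regions (the candidate partitions).
scores = {}
for r in range(1, n_regions + 1):
    for subset in itertools.combinations(range(n_regions), r):
        scores[subset] = bic(y, X[:, list(subset)])

best = min(scores, key=scores.get)  # subset with lowest BIC
```

With seven regions, the same loop would score all 2^7 = 128 models mentioned in the abstract; the comparison picks the subset whose fit justifies its complexity.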
The empirical replicability of task-based fMRI as a function of sample size
Replicating results (i.e. obtaining consistent results using a new, independent dataset) is an essential part of good science. As replicability has consequences for theories derived from empirical studies, it is of the utmost importance to better understand the mechanisms influencing it. A popular tool for non-invasive neuroimaging studies is functional magnetic resonance imaging (fMRI). While the effect of underpowered studies is well documented, the empirical assessment of the interplay between sample size and replicability of results in task-based fMRI studies remains limited. In this work, we extend existing work on this assessment in two ways. First, we use a large database of 1,400 subjects performing four types of tasks from the IMAGEN project to subsample a series of independent samples of increasing size. Second, replicability is evaluated using a multi-dimensional framework consisting of three measures: (un)conditional test-retest reliability, coherence, and stability. We demonstrate not only a positive effect of sample size, but also a trade-off between spatial resolution and replicability. When replicability is assessed voxel-wise, or when observing small areas of activation, a larger sample size than typically used in fMRI is required to replicate results. On the other hand, when focussing on clusters of voxels, we observe higher replicability. In addition, we observe variability in the size of clusters of activation between experimental paradigms, and between contrasts of parameter estimates within these.
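The subsampling scheme described above can be sketched by drawing pairs of disjoint samples of increasing size from a subject pool and measuring the overlap between the two thresholded maps. This sketch uses a Dice coefficient as the overlap measure on synthetic data (the pool size, effect pattern, and thresholds are all made up; the study's actual framework uses several distinct replicability measures):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_pool, n_voxels = 400, 500

# Synthetic pool: a true effect in the first 50 voxels, noise elsewhere.
effect = np.zeros(n_voxels)
effect[:50] = 0.5
pool = rng.standard_normal((n_pool, n_voxels)) + effect

def thresholded_map(sample, alpha=0.001):
    """Voxel-wise one-sample t-test, uncorrected threshold."""
    t, p = stats.ttest_1samp(sample, 0.0, axis=0)
    return p < alpha

def dice(a, b):
    """Dice overlap between two binary maps (0 when both are empty)."""
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 0.0

overlap = {}
for n in (20, 50, 100):
    # Two disjoint, independent subsamples of size n from the pool.
    idx = rng.permutation(n_pool)
    map1 = thresholded_map(pool[idx[:n]])
    map2 = thresholded_map(pool[idx[n:2 * n]])
    overlap[n] = dice(map1, map2)
```

Because statistical power grows with n, the overlap between independently thresholded maps rises with sample size, which is the qualitative effect the abstract reports.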