
    Controversy in statistical analysis of functional magnetic resonance imaging data

    To test the validity of statistical methods for fMRI data analysis, Eklund et al. (1) used, for the first time, large-scale experimental data rather than simulated data. Using resting-state fMRI measurements to represent a null hypothesis of no task-induced activation, the authors compared familywise error rates for voxel-based and cluster-based inferences, for both parametric and nonparametric methods, across three fMRI statistical analysis packages. They found that, for a target familywise error rate of 5%, the parametric methods gave invalid cluster-based inferences and conservative voxel-based inferences.

    Can parametric statistical methods be trusted for fMRI based group studies?

    The most widely used task fMRI analyses rely on parametric methods that depend on a variety of assumptions. While individual aspects of these fMRI models have been evaluated, they have not been evaluated in a comprehensive manner with empirical data. In this work, a total of 2 million random task fMRI group analyses have been performed using resting-state fMRI data, to compute empirical familywise error rates for the software packages SPM, FSL and AFNI, as well as a standard non-parametric permutation method. While there is some variation, for a nominal familywise error rate of 5% the parametric statistical methods are shown to be conservative for voxel-wise inference and invalid for cluster-wise inference; in particular, cluster size inference with a cluster-defining threshold of p = 0.01 generates familywise error rates up to 60%. We conduct a number of follow-up analyses and investigations that suggest the cause of the invalid cluster inferences is spatial autocorrelation functions that do not follow the assumed Gaussian shape. By comparison, the non-parametric permutation test, which is based on a small number of assumptions, is found to produce valid results for voxel-wise as well as cluster-wise inference. Using real task data, we compare the results between one parametric method and the permutation test, and find stark differences in the conclusions drawn between the two using cluster inference. These findings speak to the need to validate the statistical methods being used in the neuroimaging field.
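The empirical error-rate estimation described in this abstract can be sketched in a few lines. The toy model below (random Gaussian "statistic maps", an arbitrary voxel count, illustrative thresholds; none of it is the authors' pipeline) shows why an uncorrected per-voxel threshold makes a familywise false positive near-certain, and how much stricter a map-wide threshold must be to hold the familywise rate near 5%.

```python
import numpy as np

rng = np.random.default_rng(42)

def empirical_fwer(z_threshold, n_analyses=2000, n_voxels=1000):
    """Fraction of null analyses in which at least one voxel exceeds the
    z threshold anywhere in the map (a familywise false positive).
    Each row is a toy stand-in for one random group analysis on null data."""
    z = rng.standard_normal((n_analyses, n_voxels))
    return np.mean((z > z_threshold).any(axis=1))

# Uncorrected per-voxel threshold (z = 1.64, one-sided p = 0.05): with 1000
# voxels, essentially every null analysis shows a "significant" voxel.
print(empirical_fwer(1.64))  # ~ 1.0

# A Bonferroni-style per-voxel threshold (p = 0.05/1000, z = 3.89) is needed
# to bring the empirical familywise rate down to roughly 0.05.
print(empirical_fwer(3.89))
```

The simulated maps here have independent voxels; real fMRI maps are spatially smooth, which is precisely why cluster-based methods (and their autocorrelation assumptions) matter in the abstract above.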

    Cluster Failure Revisited: Impact of First Level Design and Data Quality on Cluster False Positive Rates

    Methodological research rarely generates broad interest, yet our work on the validity of cluster inference methods for functional magnetic resonance imaging (fMRI) created intense discussion on both the minutiae of our approach and its implications for the discipline. In the present work, we take on various critiques and further explore the limitations of our original work. We address issues with the particular event-related designs we used, considering multiple event types and randomisation of events between subjects. We consider the lack of validity found with one-sample permutation (sign-flipping) tests, investigating a number of approaches to improve the false positive control of this widely used procedure. We found that the combination of a two-sided test and cleaning the data using ICA FIX resulted in nominal false positive rates for all datasets, meaning that data cleaning is important not only for resting-state fMRI but also for task fMRI. Finally, we discuss the implications of our work for the fMRI literature as a whole, estimating that at least 10% of fMRI studies have used the most problematic cluster inference method (p = 0.01 cluster-defining threshold), and how individual studies can be interpreted in light of our findings. These additional results underscore our original conclusions on the importance of data sharing and thorough evaluation of statistical methods on realistic null data.
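The one-sample sign-flipping test examined in this abstract can be sketched on toy one-dimensional data (real analyses apply it per voxel or to a cluster statistic, and the two-sided variant is the one the authors found better behaved). Everything below is a minimal illustration, not the evaluated software.

```python
import numpy as np

rng = np.random.default_rng(0)

def sign_flip_test(contrasts, n_perm=5000, two_sided=True):
    """One-sample permutation test by sign flipping.

    Under the null, each subject's contrast value is symmetric about zero,
    so randomly flipping signs generates a null distribution for the group
    mean.  `contrasts` is a 1-D array with one value per subject."""
    observed = contrasts.mean()
    flips = rng.choice([-1.0, 1.0], size=(n_perm, len(contrasts)))
    null_means = (flips * contrasts).mean(axis=1)
    if two_sided:
        p = np.mean(np.abs(null_means) >= abs(observed))
    else:
        p = np.mean(null_means >= observed)
    return observed, p

# Null data: a non-significant p-value is expected.
print(sign_flip_test(rng.standard_normal(20)))

# Data with a real effect of one standard deviation: a small p-value.
print(sign_flip_test(rng.standard_normal(20) + 1.0))
```

Note that sign flipping assumes a symmetric null distribution per subject; the abstract's point is that violations of this assumption (and one-sided testing) can inflate false positives unless the data are cleaned.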

    Evaluation of second-level inference in fMRI analysis

    We investigate the impact of decisions in the second-level (i.e., over subjects) inferential process in functional magnetic resonance imaging on (1) the balance between false positives and false negatives and (2) the data-analytical stability, both proxies for the reproducibility of results. Second-level analysis based on a mass univariate approach typically consists of three phases. First, one specifies a general linear model for a test image that pools information from different subjects; we evaluate models that take first-level (within-subject) variability into account and models that do not. Second, inference proceeds either under parametric assumptions or via permutations. Third, we evaluate three commonly used procedures to address the multiple testing problem: familywise error rate correction, False Discovery Rate (FDR) correction, and a two-step procedure with a minimal cluster size. Based on a simulation study and real data, we find that the two-step procedure with a minimal cluster size yields the most stable results, followed by the familywise error rate correction. FDR correction yields the most variable results, for both permutation-based and parametric inference. Modeling the subject-specific variability yields a better balance between false positives and false negatives when using parametric inference.
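Of the three multiple-testing procedures compared in this abstract, the first two are easy to sketch. The implementations below are textbook Bonferroni (familywise error control) and Benjamini–Hochberg step-up (FDR control) on invented p-values; the two-step minimal-cluster-size procedure needs spatial information and is omitted here.

```python
import numpy as np

def bonferroni(p_values, alpha=0.05):
    """Familywise error control: reject only p-values below alpha / m."""
    p = np.asarray(p_values)
    return p < alpha / p.size

def benjamini_hochberg(p_values, alpha=0.05):
    """FDR control: find the largest k with p_(k) <= (k/m) * alpha and
    reject the k smallest p-values (step-up rule)."""
    p = np.asarray(p_values)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest passing rank (0-based)
        reject[order[:k + 1]] = True
    return reject

# Illustrative p-values: FDR is less strict than familywise correction.
p = np.array([0.001, 0.008, 0.012, 0.015, 0.02, 0.2, 0.3, 0.5, 0.7, 0.9])
print(bonferroni(p).sum())           # 1 rejection (only 0.001 < 0.05/10)
print(benjamini_hochberg(p).sum())   # 5 rejections
```

This strictness gap is one reason FDR finds more voxels but, as the abstract reports, yields less stable results across analyses.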

    Bayesian multi-modal model comparison: a case study on the generators of the spike and the wave in generalized spike–wave complexes

    We present a novel approach to assessing the networks involved in the generation of spontaneous pathological brain activity based on multi-modal imaging data. We propose to use probabilistic fMRI-constrained EEG source reconstruction as a complement to EEG-correlated fMRI analysis, to disambiguate between networks that co-occur at the fMRI time resolution. The method is based on Bayesian model comparison, where the different models correspond to different combinations of fMRI-activated (or deactivated) cortical clusters. By computing the model evidence (or marginal likelihood) of every candidate source space partition, we can infer the most probable set of fMRI regions that generated a given window of EEG scalp data. We illustrate the method using EEG-correlated fMRI data acquired in a patient with ictal generalized spike–wave (GSW) discharges, to examine whether different networks are involved in the generation of the spike and the wave components, respectively. To this effect, we compared a family of 128 EEG source models, based on combinations of seven regions haemodynamically involved (deactivated) during a prolonged ictal GSW discharge, namely: bilateral precuneus, bilateral medial frontal gyrus, bilateral middle temporal gyrus, and right cuneus. Bayesian model comparison revealed that the most likely model associated with the spike component consists of a prefrontal region and bilateral temporal–parietal regions, while the most likely model associated with the wave component comprises the same temporal–parietal regions only. This result supports the hypothesis of different neurophysiological mechanisms underlying the generation of the spike versus the wave components of GSW discharges.
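The model-space bookkeeping behind this comparison can be illustrated in a few lines. The region labels and log evidences below are invented placeholders; only the family size (every subset of seven regions gives 2^7 = 128 candidate models) follows the abstract. Converting log evidences to posterior model probabilities under a flat model prior is a standard step in Bayesian model comparison.

```python
from itertools import combinations

import numpy as np

# Hypothetical shorthand labels for the seven haemodynamically involved
# regions -- illustrative only, not identifiers from the study.
regions = ["lPCun", "rPCun", "lMFG", "rMFG", "lMTG", "rMTG", "rCun"]

# Every subset of the seven regions defines one candidate source model.
models = [frozenset(c) for r in range(len(regions) + 1)
          for c in combinations(regions, r)]
print(len(models))  # 128

def posterior_model_probs(log_evidence):
    """Posterior model probabilities from log evidences, assuming a flat
    prior over models: p(m|y) is proportional to exp(log p(y|m))."""
    le = np.asarray(log_evidence, dtype=float)
    le = le - le.max()        # subtract the max for numerical stability
    w = np.exp(le)
    return w / w.sum()

# Toy log evidences: the last model leads by 3 nats (a Bayes factor ~ 20),
# so it takes most of the posterior mass.
print(posterior_model_probs([-105.0, -104.0, -103.5, -100.5]).round(3))
```

Working in log evidence and subtracting the maximum avoids underflow, since raw marginal likelihoods for EEG data are astronomically small numbers.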

    The empirical replicability of task-based fMRI as a function of sample size

    Replicating results (i.e., obtaining consistent results using a new, independent dataset) is an essential part of good science. As replicability has consequences for theories derived from empirical studies, it is of utmost importance to better understand the mechanisms influencing it. A popular tool for non-invasive neuroimaging studies is functional magnetic resonance imaging (fMRI). While the effect of underpowered studies is well documented, empirical assessment of the interplay between sample size and the replicability of results for task-based fMRI studies remains limited. In this work, we extend existing work on this assessment in two ways. First, we use a large database of 1,400 subjects performing four types of tasks from the IMAGEN project to subsample a series of independent samples of increasing size. Second, replicability is evaluated using a multi-dimensional framework consisting of three measures: (un)conditional test-retest reliability, coherence, and stability. We demonstrate not only a positive effect of sample size, but also a trade-off between spatial resolution and replicability. When replicability is assessed voxelwise, or when observing small areas of activation, a larger sample size than typically used in fMRI is required to replicate results. When focussing on clusters of voxels, on the other hand, replicability is higher. In addition, we observe variability in the size of clusters of activation between experimental paradigms and between contrasts of parameter estimates within them.
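One simple way to quantify replicability of thresholded maps across independent subsamples is the Dice overlap between their suprathreshold masks; note this is a generic illustration, not one of the three measures the abstract names. The synthetic effect, noise model, and all parameters below are invented, but the sketch reproduces the reported qualitative finding: overlap between two independent subsamples grows with sample size.

```python
import numpy as np

rng = np.random.default_rng(1)

def dice(mask_a, mask_b):
    """Dice overlap between two binary activation masks:
    2 * |A and B| / (|A| + |B|)."""
    inter = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * inter / (mask_a.sum() + mask_b.sum())

def thresholded_map(sample_size, n_voxels=2000, effect=0.5, z_thresh=3.1):
    """Toy thresholded group z-map: the first 200 voxels carry a true
    effect, and the group z statistic grows with sqrt(sample_size)."""
    true_effect = np.zeros(n_voxels)
    true_effect[:200] = effect
    z = true_effect * np.sqrt(sample_size) + rng.standard_normal(n_voxels)
    return z > z_thresh

# Overlap between two independent subsamples of the same size.
for n in (20, 100, 400):
    d = dice(thresholded_map(n), thresholded_map(n))
    print(n, round(d, 2))
```

At small n, true effects only sporadically clear the threshold, so the two maps disagree; at large n nearly all truly active voxels survive in both, and the overlap approaches 1.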