
    Bayesian Spatial Binary Regression for Label Fusion in Structural Neuroimaging

    Many analyses of neuroimaging data involve studying one or more regions of interest (ROIs) in a brain image. In order to do so, each ROI must first be identified. Since every brain is unique, the location, size, and shape of each ROI varies across subjects. Thus, each ROI in a brain image must either be manually identified or (semi-)automatically delineated, a task referred to as segmentation. Automatic segmentation often involves mapping a previously manually segmented image to a new brain image and propagating the labels to obtain an estimate of where each ROI is located in the new image. A more recent approach to this problem is to propagate labels from multiple manually segmented atlases and combine the results using a process known as label fusion. To date, most label fusion algorithms either employ voting procedures or impose prior structure and subsequently find the maximum a posteriori estimator (i.e., the posterior mode) through optimization. We propose using a fully Bayesian spatial regression model for label fusion that facilitates direct incorporation of covariate information while making accessible the entire posterior distribution. We discuss the implementation of our model via Markov chain Monte Carlo and illustrate the procedure through both simulation and application to segmentation of the hippocampus, an anatomical structure known to be associated with Alzheimer's disease.
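
    The abstract contrasts the proposed fully Bayesian model with voting-based label fusion. As a point of reference only, here is a minimal sketch of the majority-voting baseline, assuming the candidate labels propagated from each registered atlas are stacked in a NumPy array; the array layout and function name are illustrative, not taken from the paper.

```python
import numpy as np

def majority_vote_fusion(atlas_labels):
    """Fuse binary ROI labels from multiple registered atlases by voting.

    atlas_labels: array of shape (n_atlases, n_voxels) with 0/1 entries,
    one row per manually segmented atlas propagated to the target image.
    Returns a 0/1 array of length n_voxels (ties broken toward 1).
    """
    votes = atlas_labels.mean(axis=0)   # fraction of atlases labeling each voxel as ROI
    return (votes >= 0.5).astype(int)

# Toy example: 3 atlases, 5 voxels
labels = np.array([[1, 0, 1, 0, 1],
                   [1, 1, 0, 0, 1],
                   [0, 1, 1, 0, 1]])
print(majority_vote_fusion(labels))     # -> [1 1 1 0 1]
```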

    Soft Null Hypotheses: A Case Study of Image Enhancement Detection in Brain Lesions

    This work is motivated by a study of a population of multiple sclerosis (MS) patients using dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to identify active brain lesions. At each visit, a contrast agent is administered intravenously to a subject and a series of images is acquired to reveal the location and activity of MS lesions within the brain. Our goal is to identify and quantify lesion enhancement location at the subject level and lesion enhancement patterns at the population level. With this example, we aim to address the difficult problem of transforming a qualitative scientific null hypothesis, such as "this voxel does not enhance", into a well-defined and numerically testable null hypothesis based on existing data. We call the procedure "soft null hypothesis" testing, as opposed to the standard "hard null hypothesis" testing. This problem is fundamentally different from: 1) testing when a quantitative null hypothesis is given; 2) clustering using a mixture distribution; or 3) identifying a reasonable threshold with a parametric null assumption. We analyze a total of 20 subjects scanned at 63 visits (approximately 30 GB of data), the largest population of such clinical brain images.
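
    As an illustration of the general idea of replacing a qualitative null with a data-driven one, the sketch below builds an empirical null distribution for a per-voxel enhancement statistic from voxels assumed not to enhance and converts lesion-voxel statistics into empirical p-values. The reference-region construction and all names here are assumptions for illustration, not the authors' procedure.

```python
import numpy as np

def soft_null_pvalues(enhancement, reference):
    """Illustrative 'soft null' test: compare each voxel's enhancement
    statistic against an empirical null built from reference voxels that
    are assumed not to enhance (e.g., normal-appearing tissue).

    enhancement: 1-D array of per-voxel enhancement statistics in lesions.
    reference:   1-D array of the same statistic in reference voxels.
    Returns one-sided empirical p-values (larger statistic = more enhancement).
    """
    reference = np.sort(reference)
    # Fraction of reference voxels at or above each observed statistic
    ranks = np.searchsorted(reference, enhancement, side="left")
    return 1.0 - ranks / len(reference)

rng = np.random.default_rng(0)
null_like = rng.normal(0, 1, size=10_000)   # stand-in reference voxels
lesion = np.array([0.2, 2.5, 3.1])          # stand-in lesion voxels
print(soft_null_pvalues(lesion, null_like))
```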

    A Broad Symmetry Criterion for Nonparametric Validity of Parametrically-Based Tests in Randomized Trials

    Pilot phases of a randomized clinical trial often suggest that a parametric model may be an accurate description of the trial's longitudinal trajectories. However, parametric models are often not used for fear that they may invalidate tests of null hypotheses of equality between the experimental groups. Existing work has shown that, for some types of data, when certain parametric models are used, validity for testing the null is preserved even if the parametric models are incorrect. Here, we provide a broader and easier-to-check characterization of parametric models that can be used to (a) preserve nonparametric validity of testing the null hypothesis, i.e., even when the models are incorrect, and (b) increase power compared to the non- or semiparametric bounds when the models are close to correct. We demonstrate our results in a clinical trial of depression in Alzheimer's patients.
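
    The result concerns when a misspecified parametric working model still yields a valid test of the null of no treatment effect. A generic way to retain validity under randomization, regardless of model misspecification, is to calibrate the parametric test statistic by permuting treatment labels; the sketch below illustrates that baseline idea only and is not the paper's symmetry criterion. All data and names are invented for illustration.

```python
import numpy as np

def permutation_pvalue(y, treat, covariates, n_perm=2000, seed=0):
    """Randomization-based calibration of a parametric test statistic:
    the treatment coefficient from a linear working model is re-computed
    under permutations of the (randomized) treatment labels, so the test
    of 'no treatment effect' stays valid even if the model is wrong."""
    rng = np.random.default_rng(seed)

    def treat_coef(t):
        X = np.column_stack([np.ones_like(t, dtype=float), t, covariates])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return beta[1]

    observed = treat_coef(treat)
    perms = np.array([treat_coef(rng.permutation(treat)) for _ in range(n_perm)])
    return np.mean(np.abs(perms) >= np.abs(observed))

rng = np.random.default_rng(1)
n = 120
covs = rng.normal(size=(n, 2))
treat = rng.integers(0, 2, size=n)
y = 0.4 * treat + covs @ np.array([0.5, -0.3]) + rng.normal(size=n)
print(permutation_pvalue(y, treat, covs))
```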

    The emergent integrated network structure of scientific research

    The practice of scientific research is often thought of as individuals and small teams striving for disciplinary advances. Yet as a whole, this endeavor more closely resembles a complex system of natural computation, in which information is obtained, generated, and disseminated more effectively than would be possible by individuals acting in isolation. Currently, the structure of this integrated and innovative landscape of scientific ideas is not well understood. Here we use tools from network science to map the landscape of interconnected research topics covered in the multidisciplinary journal PNAS since 2000. We construct networks in which nodes represent topics of study and edges give the degree to which topics occur in the same papers. The network displays small-world architecture, with dense connectivity within scientific clusters and sparse connectivity between clusters. Notably, clusters tend not to align with assigned article classifications, but instead contain topics from various disciplines. Using a temporal graph, we find that small-worldness has increased over time, suggesting growing efficiency and integration of ideas. Finally, we define a novel measure of interdisciplinarity, which is positively associated with PNAS's impact factor. Broadly, this work suggests that complex and dynamic patterns of knowledge emerge from scientific research, and that structures reflecting intellectual integration may be beneficial for obtaining scientific insight.
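
    A small sketch of the construction described here, using networkx on a few hypothetical papers: topics become nodes, co-occurrence within a paper adds edge weight, and two ingredients of small-worldness (clustering and average shortest path length) are computed. The toy topic lists are invented for illustration and are not the PNAS corpus.

```python
import itertools
import networkx as nx

# Illustrative topic lists per paper (hypothetical data).
papers = [
    {"neural networks", "graph theory", "cognition"},
    {"graph theory", "statistical mechanics"},
    {"cognition", "neuroimaging", "graph theory"},
    {"statistical mechanics", "phase transitions"},
]

# Nodes are topics; edge weights count how often two topics co-occur in a paper.
G = nx.Graph()
for topics in papers:
    for a, b in itertools.combinations(sorted(topics), 2):
        w = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# Small-world ingredients: high clustering and short path lengths.
print("clustering:", nx.average_clustering(G))
print("avg shortest path:", nx.average_shortest_path_length(G))
```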

    Estimating effects by combining instrumental variables with case-control designs: the role of principal stratification

    The instrumental variable framework is commonly used in the estimation of causal effects from cohort samples. In the case of more efficient designs such as the case-control study, however, the combination of the instrumental variable and complex sampling designs requires new methodological consideration. As the prevalence of Mendelian randomization studies is increasing and the cost of genotyping and expression data can be high, the analysis of data gathered from more cost-effective sampling designs is of prime interest. We show that the standard instrumental variable analysis is not applicable to the case-control design and can lead to erroneous estimation and inference. We also propose a method based on principal stratification for the analysis of data arising from the combination of case-control sampling and instrumental variable design and illustrate it with a study in oncology.
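
    For context, the sketch below shows the textbook Wald (ratio) instrumental-variable estimator under cohort sampling with a binary instrument; the abstract's point is precisely that this standard analysis does not carry over to case-control sampling. The simulated variables are hypothetical.

```python
import numpy as np

def wald_iv_estimate(z, x, y):
    """Textbook Wald (ratio) IV estimator for a binary instrument z,
    exposure x, and outcome y under cohort sampling.  Applying this
    directly to case-control data is not valid; this only illustrates
    the cohort-sampling baseline."""
    z = np.asarray(z, dtype=bool)
    num = y[z].mean() - y[~z].mean()   # instrument's effect on the outcome
    den = x[z].mean() - x[~z].mean()   # instrument's effect on the exposure
    return num / den

rng = np.random.default_rng(0)
n = 5_000
z = rng.integers(0, 2, n)                        # e.g., a genetic variant (Mendelian randomization)
u = rng.normal(size=n)                           # unmeasured confounder
x = 0.5 * z + u + rng.normal(size=n)             # exposure
y = 0.7 * x + u + rng.normal(size=n)             # outcome; true causal effect 0.7
print(wald_iv_estimate(z, x, y))                 # close to 0.7 in a cohort sample
```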

    Covariance Assisted Multivariate Penalized Additive Regression (CoMPAdRe)

    We propose a new method for the simultaneous selection and estimation of multivariate sparse additive models with correlated errors. Our method, Covariance Assisted Multivariate Penalized Additive Regression (CoMPAdRe), simultaneously selects among null, linear, and smooth non-linear effects for each predictor while jointly estimating the sparse residual structure among responses, with the motivation that accounting for inter-response correlation can lead to improved accuracy in variable selection and estimation efficiency. CoMPAdRe is constructed in a computationally efficient way that allows the selection and estimation of linear and non-linear covariates to be conducted in parallel across responses. Compared to single-response approaches that marginally select linear and non-linear covariate effects, we demonstrate in simulation studies that joint multivariate modeling leads to gains in both estimation efficiency and selection accuracy, with gains of greater magnitude in settings where the signal is moderate relative to the level of noise. We apply our approach to protein-mRNA expression levels from multiple breast cancer pathways obtained from The Cancer Proteome Atlas and characterize both mRNA-protein associations and protein-protein subnetworks for each pathway. We find non-linear mRNA-protein associations for the Core Reactive, EMT, PIK-AKT, and RTK pathways.
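
    To make the contrast with single-response approaches concrete, here is a minimal sketch that, for one response and one predictor, chooses among a null, a linear, and a smooth spline effect by cross-validation with scikit-learn. This is only the marginal baseline the abstract compares against, not CoMPAdRe's joint, covariance-assisted selection; the simulated data are illustrative.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

# Simulated single response with a truly non-linear effect of one predictor.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=(300, 1))
y = np.sin(1.5 * x[:, 0]) + 0.3 * rng.normal(size=300)

# Candidate effects for this predictor: null, linear, or smooth non-linear.
candidates = {
    "null":   DummyRegressor(strategy="mean"),
    "linear": LinearRegression(),
    "smooth": make_pipeline(SplineTransformer(n_knots=8, degree=3), LinearRegression()),
}
scores = {name: cross_val_score(m, x, y, cv=5).mean() for name, m in candidates.items()}
print(max(scores, key=scores.get), scores)   # expect 'smooth' to be selected here
```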

    Control-Group Feature Normalization for Multivariate Pattern Analysis Using the Support Vector Machine

    Normalization of feature vector values is a common practice in machine learning. Generally, each feature is standardized to the unit hypercube or normalized to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or other methods are sensitive to the specific normalization applied to the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it relies on the total variability of these features. This total variability inevitably depends on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high-dimensional space. In this work, we propose an alternative approach that uses an estimate of the control-group standard deviation to normalize features before training. We also show that control-based normalization provides better interpretation with respect to the estimated multivariate disease pattern and improves classifier performance in many cases.
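
    A minimal sketch of the control-group normalization idea described here: feature means and standard deviations are estimated from control subjects only and then applied to all subjects before training an SVM. The simulated data and exact scaling details are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def control_normalize(X, is_control):
    """Standardize every feature using the mean and standard deviation
    estimated from control subjects only, then apply to all subjects,
    so between-group separation does not inflate the scaling variance."""
    mu = X[is_control].mean(axis=0)
    sd = X[is_control].std(axis=0)
    sd[sd == 0] = 1.0                        # guard against constant features
    return (X - mu) / sd

rng = np.random.default_rng(0)
n, p = 100, 50
y = np.repeat([0, 1], n // 2)                # 0 = control, 1 = patient
X = rng.normal(size=(n, p)) + 0.8 * y[:, None] * (np.arange(p) < 5)  # 5 informative features

Xn = control_normalize(X, y == 0)
clf = SVC(kernel="linear").fit(Xn, y)
print("training accuracy:", clf.score(Xn, y))
```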

    Addressing Confounding in Predictive Models with an Application to Neuroimaging

    Understanding structural changes in the brain that are caused by a particular disease is a major goal of neuroimaging research. Multivariate pattern analysis (MVPA) comprises a collection of tools that can be used to understand complex disease effects across the brain. We discuss several important issues that must be considered when analyzing data from neuroimaging studies using MVPA. In particular, we focus on the consequences of confounding by non-imaging variables such as age and sex on the results of MVPA. After reviewing current practice to address confounding in neuroimaging studies, we propose an alternative approach based on inverse probability weighting. Although the proposed method is motivated by neuroimaging applications, it is broadly applicable to many problems in machine learning and predictive modeling. We demonstrate the advantages of our approach on simulated and real data examples.
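
    A minimal sketch of inverse probability weighting in a predictive-modeling context: group membership is modeled as a function of the confounders, and the resulting inverse-probability weights are passed as sample weights when training the classifier. The confounders (age, sex) and the simulated data are placeholders for illustration; this is not the authors' exact estimator.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400
age = rng.normal(60, 10, n)
sex = rng.integers(0, 2, n)
# Group membership (e.g., patient vs. control) depends on the confounders.
p_group = 1 / (1 + np.exp(-(0.05 * (age - 60) + 0.5 * (sex - 0.5))))
group = rng.binomial(1, p_group)
# Imaging features carry both a disease effect and a confounded age effect.
X = rng.normal(size=(n, 20))
X[:, 0] += 0.8 * group + 0.03 * (age - 60)

# Inverse probability weights: 1 / P(observed group | confounders).
conf = np.column_stack([(age - age.mean()) / age.std(), sex])   # scaled for stability
ps = LogisticRegression().fit(conf, group).predict_proba(conf)[:, 1]
weights = np.where(group == 1, 1 / ps, 1 / (1 - ps))

# Train the predictive model with the confounder-balancing weights.
clf = SVC(kernel="linear").fit(X, group, sample_weight=weights)
print("weighted-fit training accuracy:", clf.score(X, group))
```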