
    PARALLEL INDEPENDENT COMPONENT ANALYSIS WITH REFERENCE FOR IMAGING GENETICS: A SEMI-BLIND MULTIVARIATE APPROACH

    Imaging genetics is an emerging field dedicated to the study of the genetic underpinnings of brain structure and function. Over the last decade, brain imaging techniques such as magnetic resonance imaging (MRI) have been increasingly applied to measure morphometry, task-based function and connectivity in living brains. Meanwhile, high-throughput genotyping employing genome-wide techniques has made it feasible to sample the entire genome of a substantial number of individuals. While there is growing interest in image-wide and genome-wide approaches, which allow unbiased searches over a large range of variants, one of the most challenging problems is the correction for the huge number of statistical tests used in univariate models. In contrast, a reference-guided multivariate approach offers a specific advantage: it simultaneously assesses many variables for aggregate effects while leveraging prior information, which can improve the robustness of the results compared to a fully blind approach. In this dissertation we present a semi-blind multivariate approach, parallel independent component analysis with reference (pICA-R), to better reveal relationships between hidden factors of particular attributes. First, a consistency-based order estimation approach is introduced to advance the application of ICA to genotype data. The pICA-R approach is then presented, where independent components are extracted from two modalities in parallel and inter-modality associations are subsequently optimized for pairs of components. In particular, prior information is incorporated to elicit components of particular interest, which helps identify factors carrying small amounts of variance in large complex datasets. The pICA-R approach is further extended to accommodate multiple references whose interrelationships are unknown, allowing the investigation of functional influence on neurobiological traits of potentially related genetic variants implicated in biology.
Applied to a schizophrenia study, pICA-R reveals that a complex genetic factor involving multiple pathways underlies schizophrenia-related gray matter deficits in prefrontal and temporal regions. The extended multi-reference approach, when employed to study alcohol dependence, delineates a complex genetic architecture, where the CREB-BDNF pathway plays a key role in the genetic factor underlying a proportion of variation in cue-elicited brain activations, which in turn plays a role in phenotypic symptoms of alcohol dependence. In summary, our work makes several important contributions to advance the application of ICA to imaging genetics studies, which holds the promise to improve our understanding of genetics underlying brain structure and function in healthy and disease

    Scaling Up Large-scale Sparse Learning and Its Application to Medical Imaging

    Large-scale ℓ1-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scale up the optimization problem in parallel. Parallel solvers run on multiple cores in a shared-memory system or a distributed environment to speed up the computation, but their practical usage is limited by the huge dimension of the feature space and by synchronization problems. In this dissertation, I carry out research in this direction, with a particular focus on scaling up the optimization of sparse learning for supervised and unsupervised learning problems. For supervised learning, I first propose an asynchronous parallel solver to optimize the large-scale sparse learning model in a multithreading environment. Moreover, I propose a distributed framework to conduct the learning process when the dataset is stored in a distributed manner across different machines. The proposed model is then extended to the study of genetic risk factors for Alzheimer's Disease (AD) across different research institutions, integrating a group feature selection framework to rank the top risk SNPs for AD. For the unsupervised learning problem, I propose a highly efficient solver, termed Stochastic Coordinate Coding (SCC), which scales up the optimization of dictionary learning and sparse coding problems. In medical imaging research, it is often beneficial to study the longitudinal features of patients across different time points together.
To further improve the dictionary learning model, I propose a multi-task dictionary learning method that learns the different tasks simultaneously and uses shared and individual dictionaries to encode both consistent and changing imaging features.
    Dissertation/Thesis. Doctoral Dissertation, Computer Science 201
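As an illustration of the ℓ1-regularized loss minimization that the abstract above refers to, the following is a minimal sketch of serial cyclic coordinate descent for the lasso (squared loss plus an ℓ1 penalty). It is not the dissertation's asynchronous parallel solver or its SCC method; the function names `soft_threshold` and `lasso_coordinate_descent`, the 1/(2n) scaling of the loss, and the toy data are assumptions chosen for this example only.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of n rows with d features each; y is a list of n targets.
    (Hypothetical helper for illustration; serial, not parallel.)
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw; w starts at zero
    # Mean squared value of each feature column
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature cannot reduce the loss
            # Correlation of feature j with the residual, with w[j]'s own
            # contribution added back in
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual consistent with the updated weight
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


# Toy data (made up): targets generated from weights [3.0, 0.1]
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [3.0, 0.1, -3.0, -0.1]
w = lasso_coordinate_descent(X, y, lam=0.2)
print(w)
```

On this toy problem the large coefficient is shrunk toward zero while the small one is set exactly to zero, which is the sparsity-inducing behavior that motivates ℓ1 regularization. Each coordinate update touches only one feature column, which is one reason coordinate-wise schemes are a common starting point for parallel variants.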