Identifying progressive imaging genetic patterns via multi-task sparse canonical correlation analysis: a longitudinal study of the ADNI cohort
Motivation
Identifying the genetic basis of brain structure, function and disorder by using imaging quantitative traits (QTs) as endophenotypes is an important task in brain science. Brain QTs often change over time as the disorder progresses, so understanding how genetic factors influence these progressive QT changes is of great importance. Most existing imaging genetics methods analyze only the baseline neuroimaging data, and thus omit longitudinal imaging data across multiple time points, which contain important disease progression information.
Results
We propose a novel temporal imaging genetic model that performs multi-task sparse canonical correlation analysis (T-MTSCCA). Our model uses longitudinal neuroimaging data to uncover how single nucleotide polymorphisms (SNPs) affect brain QTs over time. By incorporating the relationships among the longitudinal imaging data and those among SNPs, T-MTSCCA can identify a trajectory of progressive imaging genetic patterns over time. We propose an efficient algorithm to solve the problem and show its convergence. We evaluate T-MTSCCA on 408 subjects from the Alzheimer’s Disease Neuroimaging Initiative database with longitudinal magnetic resonance imaging data and genetic data available. The experimental results show that T-MTSCCA performs better than or on par with state-of-the-art methods. In particular, T-MTSCCA identifies higher canonical correlation coefficients and captures clearer canonical weight patterns. This suggests that T-MTSCCA identifies time-consistent and time-dependent SNPs and imaging QTs, which further helps in understanding the genetic basis of brain QT changes over time during disease progression.
Availability and implementation
The software and simulation data are publicly available at https://github.com/dulei323/TMTSCCA.
Supplementary information
Supplementary data are available at Bioinformatics online.
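To make the idea of sparse canonical correlation analysis concrete, here is a minimal single-time-point sketch. It is not the T-MTSCCA algorithm described in the abstract (no multi-task or temporal terms); it is a generic rank-1 sparse CCA using alternating soft-thresholded updates, under the simplifying assumption that within-set covariances are treated as identity. The function names and the penalty parameters `lam_u` and `lam_v` are hypothetical.

```python
import numpy as np

def soft_threshold(v, lam):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def _unit(v):
    """Scale v to unit l2 norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100, tol=1e-8):
    """Rank-1 sparse CCA sketch (illustrative only, not T-MTSCCA).

    X: (n, p) array, e.g. SNP genotypes; Y: (n, q) array, e.g. imaging QTs.
    Returns sparse weight vectors u (p,), v (q,) and the sample correlation
    between the canonical variates X @ u and Y @ v.
    """
    # Standardize columns so the cross-product below is a correlation matrix.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]

    # Initialize v with the leading right singular vector of C.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]
    u = np.zeros(X.shape[1])

    for _ in range(n_iter):
        u_new = _unit(soft_threshold(C @ v, lam_u))
        v_new = _unit(soft_threshold(C.T @ u_new, lam_v))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break

    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr
```

Larger values of `lam_u` and `lam_v` zero out more weights; if they are set too high, both vectors collapse to zero and the correlation is undefined.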
Sparse reduced-rank regression for imaging genetics studies: models and applications
We present a novel statistical technique, the sparse reduced-rank regression (sRRR) model,
which is a strategy for multivariate modelling of high-dimensional imaging responses and
genetic predictors. By adopting penalisation techniques, the model is able to enforce sparsity
in the regression coefficients, identifying subsets of genetic markers that best explain
the variability observed in subsets of the phenotypes. To properly exploit the rich structure
present in each of the imaging and genetics domains, we additionally propose the use of
several structured penalties within the sRRR model. Using simulation procedures that accurately
reflect realistic imaging genetics data, we present detailed evaluations of the sRRR
method in comparison with the more traditional univariate linear modelling approach. In
all settings considered, we show that sRRR possesses better power to detect the deleterious
genetic variants. Moreover, using a simple genetic model, we demonstrate the potential
benefits, in terms of statistical power, of carrying out voxel-wise searches as opposed to
extracting averages over regions of interest in the brain. Since this entails the use of phenotypic
vectors of enormous dimensionality, we suggest the use of a sparse classification
model as a de-noising step, prior to the imaging genetics study. Finally, we present the
application of a data re-sampling technique within the sRRR model for model selection.
Using this approach we are able to rank the genetic markers in order of importance of association
to the phenotypes, and similarly rank the phenotypes in order of importance to
the genetic markers. Lastly, we illustrate the application of the proposed
statistical models to three real imaging genetics datasets and highlight some potential
associations.
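For orientation, a generic sparse reduced-rank regression objective can be written as follows. The abstract gives no notation, so every symbol here is an assumption: $X \in \mathbb{R}^{n \times p}$ holds the genetic predictors, $Y \in \mathbb{R}^{n \times q}$ the imaging responses, the coefficient matrix is factorized as $C = BA$ with rank $r$, and $\lambda_b, \lambda_a \ge 0$ are penalty weights. The structured penalties mentioned in the abstract are not shown.

```latex
\min_{B \in \mathbb{R}^{p \times r},\; A \in \mathbb{R}^{r \times q}}
\; \lVert Y - XBA \rVert_F^2
\;+\; \lambda_b \sum_{k=1}^{r} \lVert b_k \rVert_1
\;+\; \lambda_a \sum_{k=1}^{r} \lVert a_k \rVert_1
```

Here $b_k$ is the $k$-th column of $B$ and $a_k$ the $k$-th row of $A$. In this form, the $\ell_1$ term on $b_k$ selects a subset of genetic markers and the $\ell_1$ term on $a_k$ selects a subset of phenotypes, which matches the two-sided selection the abstract describes.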
Structured Sparse Methods for Imaging Genetics
Imaging genetics is an emerging and promising technique that investigates how genetic variations affect brain development, structure, and function. By exploiting disorder-related neuroimaging phenotypes, this class of studies provides a novel direction to reveal and understand the complex genetic mechanisms. Imaging genetics studies are often challenging because of the relatively small number of subjects and the extremely high dimensionality of both imaging data and genomic data. In this dissertation, I pursue research on imaging genetics with a particular focus on two tasks: building predictive models between neuroimaging data and genomic data, and identifying disorder-related genetic risk factors through image-based biomarkers. To this end, I consider a suite of structured sparse methods for imaging genetics that can produce interpretable models and are robust to overfitting. With carefully designed sparsity-inducing regularizers, different biological priors are incorporated into the learning models. More specifically, in the Allen brain image--gene expression study, I adopt an advanced sparse coding approach for image feature extraction and employ a multi-task learning approach for multi-class annotation. Moreover, I propose a label structure-based two-stage learning framework, which utilizes the hierarchical structure among labels, for multi-label annotation. In the Alzheimer's Disease Neuroimaging Initiative (ADNI) imaging genetics study, I employ Lasso together with EDPP (enhanced dual polytope projections) screening rules to quickly identify Alzheimer's disease risk SNPs. I also adopt the tree-structured group Lasso with MLFre (multi-layer feature reduction) screening rules to incorporate linkage disequilibrium information into modeling. Moreover, I propose a novel absolute fused Lasso model for ADNI imaging genetics. This method utilizes SNP spatial structure and is robust to the choice of reference alleles in genotype coding.
In addition, I propose a two-level structured sparse model that incorporates gene-level networks through a graph penalty into SNP-level model construction. Lastly, I explore a convolutional neural network approach for accurately predicting Alzheimer's disease-related imaging phenotypes. Experimental results on real-world imaging genetics applications demonstrate the efficiency and effectiveness of the proposed structured sparse methods.
Dissertation/Thesis, Doctoral Dissertation, Computer Science, 201
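The claim that an absolute fused Lasso is robust to the choice of reference allele can be illustrated with a small sketch. It assumes the penalty takes the form of a sum of |  |β_i| − |β_{i+1}|  | over adjacent SNPs, which is a reading of the name and not something the abstract spells out, and it assumes that switching a SNP's reference allele flips the sign of its coefficient. Both function names are hypothetical.

```python
import numpy as np

def fused_lasso_penalty(beta):
    """Standard fused penalty: sum of |beta[i+1] - beta[i]| over neighbours."""
    return float(np.sum(np.abs(np.diff(beta))))

def absolute_fused_penalty(beta):
    """Assumed absolute fused penalty: sum of | |beta[i+1]| - |beta[i]| |."""
    return float(np.sum(np.abs(np.diff(np.abs(beta)))))

# Coefficients of four adjacent SNPs under one genotype coding.
beta = np.array([0.5, 0.5, 0.5, 0.0])

# Same model after switching the reference allele of the second SNP,
# which (by assumption) flips the sign of its coefficient.
beta_recoded = beta * np.array([1.0, -1.0, 1.0, 1.0])

print(fused_lasso_penalty(beta), fused_lasso_penalty(beta_recoded))
print(absolute_fused_penalty(beta), absolute_fused_penalty(beta_recoded))
```

The standard fused penalty changes under recoding because it compares signed coefficients, while the absolute version compares magnitudes only and so stays the same.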
Variable selection and regression analysis for graph-structured covariates with an application to genomics
Graphs and networks are common ways of depicting biological information. In
biology, many different biological processes are represented by graphs, such as
regulatory networks, metabolic pathways and protein--protein interaction
networks. This kind of a priori use of graphs is a useful supplement to the
standard numerical data such as microarray gene expression data. In this paper
we consider the problem of regression analysis and variable selection when the
covariates are linked on a graph. We study a graph-constrained regularization
procedure and its theoretical properties for regression analysis to take into
account the neighborhood information of the variables measured on a graph. This
procedure involves a smoothness penalty on the coefficients that is defined as
a quadratic form of the Laplacian matrix associated with the graph. We
establish estimation and model selection consistency results and provide
estimation bounds for both fixed and diverging numbers of parameters in
regression models. We demonstrate by simulations and a real data set that the
proposed procedure can lead to better variable selection and prediction than
existing methods that ignore the graph information associated with the
covariates.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS332 by the
Institute of Mathematical Statistics (http://www.imstat.org
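The smoothness penalty described above, a quadratic form in the graph Laplacian, can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: it uses the normalized Laplacian, omits any sparsity (l1) term so that the estimator has a closed form, and assumes `X.T @ X + lam * L` is invertible. The function names and the parameter `lam` are hypothetical.

```python
import numpy as np

def normalized_laplacian(A):
    """Normalized Laplacian of an undirected graph with adjacency matrix A.

    Isolated nodes (degree 0) get a zero row and column.
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    connected = d > 0
    inv_sqrt = np.zeros_like(d)
    inv_sqrt[connected] = 1.0 / np.sqrt(d[connected])
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

def smoothness_penalty(beta, L):
    """Quadratic form beta' L beta; small when degree-scaled coefficients
    of neighbouring variables are similar."""
    return float(beta @ L @ beta)

def graph_ridge(X, y, L, lam):
    """Minimize ||y - X b||^2 + lam * b' L b (no l1 term in this sketch)."""
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

For the normalized Laplacian, the quadratic form equals the sum over edges (u, v) of (β_u/√d_u − β_v/√d_v)², so coefficients that vary smoothly along the graph incur little penalty.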
SUBIC: A Supervised Bi-Clustering Approach for Precision Medicine
Traditional medicine typically applies one-size-fits-all treatment for the
entire patient population whereas precision medicine develops tailored
treatment schemes for different patient subgroups. The fact that some factors
may be more significant for a specific patient subgroup motivates clinicians
and medical researchers to develop new approaches to subgroup detection and
analysis, which is an effective strategy to personalize treatment. In this
study, we propose a novel patient subgroup detection method, called Supervised
Biclustering (SUBIC), using convex optimization and apply our approach to detect
patient subgroups and prioritize risk factors for hypertension (HTN) in a
vulnerable demographic subgroup (African-American). Our approach not only finds
patient subgroups with guidance of a clinically relevant target variable but
also identifies and prioritizes risk factors by pursuing sparsity of the input
variables and encouraging similarity among the input variables and between the
input and target variable