Identification of gene pathways implicated in Alzheimer's disease using longitudinal imaging phenotypes with sparse regression
We present a new method for the detection of gene pathways associated with a
multivariate quantitative trait, and use it to identify causal pathways
associated with an imaging endophenotype characteristic of longitudinal
structural change in the brains of patients with Alzheimer's disease (AD). Our
method, known as pathways sparse reduced-rank regression (PsRRR), uses group
lasso penalised regression to jointly model the effects of genome-wide single
nucleotide polymorphisms (SNPs), grouped into functional pathways using prior
knowledge of gene-gene interactions. Pathways are ranked in order of importance
using a resampling strategy that exploits finite sample variability. Our
application study uses whole genome scans and MR images from 464 subjects in
the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. 66,182 SNPs
are mapped to 185 gene pathways from the KEGG pathways database. Voxel-wise
imaging signatures characteristic of AD are obtained by analysing 3D patterns
of structural change at 6, 12 and 24 months relative to baseline. High-ranking,
AD endophenotype-associated pathways in our study include those describing
chemokine, Jak-stat and insulin signalling pathways, and tight junction
interactions. All of these have been previously implicated in AD biology. In a
secondary analysis, we investigate SNPs and genes that may be driving pathway
selection, and identify a number of previously validated AD genes including
CR1, APOE and TOMM40.
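The group lasso penalty mentioned in this abstract selects or discards whole groups of coefficients at once. The following is a minimal sketch of its proximal step (block soft-thresholding), assuming non-overlapping groups. The function name, the toy coefficients and the two "pathway" groups are illustrative assumptions, and this is not the PsRRR implementation.

```python
import math

def group_soft_threshold(beta, groups, lam):
    """Proximal operator of the penalty lam * sum_g ||beta_g||_2.

    beta   : list of coefficients (e.g. one per SNP)
    groups : list of index lists, assumed non-overlapping (hypothetical pathways)
    lam    : non-negative threshold
    """
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        # A group whose norm is at most lam is zeroed out entirely;
        # otherwise every coefficient in it is shrunk by the same factor.
        scale = 0.0 if norm <= lam else 1.0 - lam / norm
        for i in idx:
            out[i] = scale * beta[i]
    return out

# Toy example: the first group has norm 5.0, the second has norm 0.5.
beta = [3.0, 4.0, 0.3, -0.4]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
```

With `lam=1.0` the weak second group is removed as a whole while the first is only shrunk, which is the behaviour that makes the penalty suitable for selecting pathways rather than individual SNPs.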
Identifying progressive imaging genetic patterns via multi-task sparse canonical correlation analysis: a longitudinal study of the ADNI cohort
Motivation
Identifying the genetic basis of brain structure, function and disorder by using imaging quantitative traits (QTs) as endophenotypes is an important task in brain science. Brain QTs often change over time as a disorder progresses, so understanding how genetic factors influence progressive brain QT changes is of great importance. Most existing imaging genetics methods analyze only baseline neuroimaging data, and therefore omit longitudinal imaging data across multiple time points that contain important disease progression information.
Results
We propose a novel temporal imaging genetic model that performs multi-task sparse canonical correlation analysis (T-MTSCCA). Our model uses longitudinal neuroimaging data to uncover how single nucleotide polymorphisms (SNPs) affect brain QTs over time. By incorporating the relationships among the longitudinal imaging data and those among SNPs, T-MTSCCA can identify a trajectory of progressive imaging genetic patterns over time. We propose an efficient algorithm to solve the problem and show its convergence. We evaluate T-MTSCCA on 408 subjects from the Alzheimer's Disease Neuroimaging Initiative database with longitudinal magnetic resonance imaging data and genetic data available. The experimental results show that T-MTSCCA performs better than or comparably to the state-of-the-art methods. In particular, T-MTSCCA identifies higher canonical correlation coefficients and captures clearer canonical weight patterns. This suggests that T-MTSCCA identifies time-consistent and time-dependent SNPs and imaging QTs, which further helps in understanding the genetic basis of brain QT changes over time during disease progression.
Availability and implementation
The software and simulation data are publicly available at https://github.com/dulei323/TMTSCCA.
Supplementary information
Supplementary data are available at Bioinformatics online.
Sparse reduced-rank regression for imaging genetics studies: models and applications
We present a novel statistical technique, the sparse reduced-rank regression (sRRR) model,
which is a strategy for multivariate modelling of high-dimensional imaging responses and
genetic predictors. By adopting penalisation techniques, the model is able to enforce sparsity
in the regression coefficients, identifying subsets of genetic markers that best explain
the variability observed in subsets of the phenotypes. To properly exploit the rich structure
present in each of the imaging and genetics domains, we additionally propose the use of
several structured penalties within the sRRR model. Using simulation procedures that accurately
reflect realistic imaging genetics data, we present detailed evaluations of the sRRR
method in comparison with the more traditional univariate linear modelling approach. In
all settings considered, we show that sRRR possesses better power to detect the deleterious
genetic variants. Moreover, using a simple genetic model, we demonstrate the potential
benefits, in terms of statistical power, of carrying out voxel-wise searches as opposed to
extracting averages over regions of interest in the brain. Since this entails the use of phenotypic
vectors of enormous dimensionality, we suggest the use of a sparse classification
model as a de-noising step, prior to the imaging genetics study. Finally, we present the
application of a data re-sampling technique within the sRRR model for model selection.
Using this approach we are able to rank the genetic markers in order of importance of association
to the phenotypes, and similarly rank the phenotypes in order of importance to
the genetic markers. To conclude, we illustrate the application of the proposed
statistical models to three real imaging genetics datasets and highlight some potential
associations.
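The re-sampling idea described in this abstract, ranking variables by how often a sparse model selects them across random subsamples, can be sketched as follows. This is only an illustration under stated assumptions: `selection_frequencies` and `toy_select` are hypothetical names, and `toy_select` is a stand-in for fitting a sparse model such as sRRR on the subsample and returning the variables with non-zero coefficients.

```python
import random
from collections import Counter

def selection_frequencies(n_samples, select, n_resamples=100, fraction=0.5, seed=0):
    """Fraction of random subsamples in which each variable is selected.

    select : callable taking a list of sample indices and returning the
             names of the variables a sparse model picked on that subsample
    """
    rng = random.Random(seed)
    counts = Counter()
    size = max(1, int(fraction * n_samples))
    for _ in range(n_resamples):
        subset = rng.sample(range(n_samples), size)  # draw without replacement
        counts.update(set(select(subset)))           # count each variable once per resample
    return {name: c / n_resamples for name, c in counts.items()}

def toy_select(subset):
    # Hypothetical selector: "A" is always chosen, while "B" is chosen
    # only when sample 0 happens to be in the subsample.
    return ["A"] + (["B"] if 0 in subset else [])

freqs = selection_frequencies(10, toy_select)
ranking = sorted(freqs, key=freqs.get, reverse=True)
```

Variables that are selected regardless of which subjects are drawn end up at the top of `ranking`, whereas variables whose selection depends on a few samples receive lower frequencies.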
Fast Identification of Biological Pathways Associated with a Quantitative Trait Using Group Lasso with Overlaps
Where causal SNPs (single nucleotide polymorphisms) tend to accumulate within
biological pathways, the incorporation of prior pathways information into a
statistical model is expected to increase the power to detect true associations
in a genetic association study. Most existing pathways-based methods rely on
marginal SNP statistics and do not fully exploit the dependence patterns among
SNPs within pathways. We use a sparse regression model, with SNPs grouped into
pathways, to identify causal pathways associated with a quantitative trait.
Notable features of our "pathways group lasso with adaptive weights" (P-GLAW)
algorithm include the incorporation of all pathways in a single regression
model, an adaptive pathway weighting procedure that accounts for factors
biasing pathway selection, and the use of a bootstrap sampling procedure for
the ranking of important pathways. P-GLAW takes account of the presence of
overlapping pathways and uses a novel combination of techniques to optimise
model estimation, making it fast to run, even on whole genome datasets. In a
comparison study with an alternative pathways method based on univariate SNP
statistics, our method demonstrates high sensitivity and specificity for the
detection of important pathways, showing the greatest relative gains in
performance where marginal SNP effect sizes are small.
Comment: 29 pages
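One common way to fit a group lasso when groups overlap is to duplicate each shared variable so that every pathway receives its own copy, which makes the groups disjoint in the expanded design matrix. The sketch below illustrates that expansion. The abstract does not state that P-GLAW uses exactly this construction, and the function name and toy data are assumptions.

```python
def expand_overlapping_groups(X, pathways):
    """Duplicate columns so that overlapping pathways become disjoint groups.

    X        : list of rows, one row per subject and one column per SNP
    pathways : dict mapping a pathway name to the SNP column indices it contains
    Returns the expanded matrix and, per pathway, its column indices in it.
    """
    source_cols = []  # original column index behind each expanded column
    groups = {}
    for name, cols in pathways.items():
        start = len(source_cols)
        source_cols.extend(cols)
        groups[name] = list(range(start, len(source_cols)))
    X_expanded = [[row[j] for j in source_cols] for row in X]
    return X_expanded, groups

# Toy example: SNP column 1 belongs to both hypothetical pathways.
X = [[1, 2, 3],
     [4, 5, 6]]
pathways = {"P1": [0, 1], "P2": [1, 2]}
X_expanded, groups = expand_overlapping_groups(X, pathways)
```

After the expansion a standard non-overlapping group lasso solver can be applied to `X_expanded` with the index sets in `groups`, at the cost of a wider design matrix.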