106 research outputs found

    Developing Predictive Molecular Maps of Human Disease through Community-based Modeling

    The failure of biology to identify the molecular causes of disease has led to disappointment in the rate of development of new medicines. By combining the power of community-based modeling with broad access to large datasets on a platform that promotes reproducible analyses, we can work towards more predictive molecular maps that can deliver better therapeutics.

    Identifying and ranking potential driver genes of Alzheimer's disease using multiview evidence aggregation.

    MOTIVATION: Late-onset Alzheimer's disease currently has no known effective treatment options. To better understand the disease, new multi-omic datasets have recently been generated with the goal of identifying its molecular causes. However, most analytic studies using these datasets focus on uni-modal analysis of the data. Here, we propose a data-driven approach that integrates multiple data types and analytic outcomes to aggregate evidence supporting the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our article are: (i) a general machine learning framework that learns the key characteristics of a few known driver genes from multiple feature sets and identifies other potential driver genes with similar feature representations, and (ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types. RESULTS: We demonstrate the utility of our machine learning algorithm on two benchmark multiview datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer's. We show that our ranked genes show a significant enrichment for single nucleotide polymorphisms associated with Alzheimer's and are enriched in pathways that have been previously associated with the disease. AVAILABILITY AND IMPLEMENTATION: Source code and links to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking
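
    The idea of combining per-view gene scores with external GWAS evidence can be illustrated with a minimal sketch. This is not the authors' algorithm (their implementation is in the repository linked above); it is a simple mean-rank aggregation, and the function names, the `gwas_weight` parameter, and the gene labels are hypothetical.

```python
import math


def rank_positions(scores):
    """Map each gene to its rank within one view (1 = highest score)."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def aggregate_ranks(views, gwas_p=None, gwas_weight=0.0):
    """Rank genes by mean rank across views, optionally boosted by GWAS.

    views       -- list of dicts mapping gene -> score (one dict per view)
    gwas_p      -- optional dict mapping gene -> GWAS p-value (assumption:
                   one summary p-value per gene is already available)
    gwas_weight -- hypothetical weight applied to -log10(p)
    """
    # Only genes scored in every view can be compared fairly.
    genes = set(views[0])
    for view in views[1:]:
        genes &= set(view)

    per_view = [rank_positions({g: v[g] for g in genes}) for v in views]
    n = len(genes)

    combined = {}
    for g in genes:
        mean_rank = sum(r[g] for r in per_view) / len(per_view)
        # Rescale so the best possible mean rank maps to 1 and the worst to 0.
        score = 1.0 - (mean_rank - 1) / max(n - 1, 1)
        if gwas_p is not None and g in gwas_p:
            score += gwas_weight * -math.log10(gwas_p[g])
        combined[g] = score

    # Highest combined score first; ties broken alphabetically.
    return sorted(combined, key=lambda g: (-combined[g], g))


views = [
    {"GENE_A": 0.9, "GENE_B": 0.5, "GENE_C": 0.1},
    {"GENE_A": 0.8, "GENE_B": 0.9, "GENE_C": 0.2},
]
without_gwas = aggregate_ranks(views)
with_gwas = aggregate_ranks(views, gwas_p={"GENE_B": 1e-8}, gwas_weight=0.1)
```

    In this toy input the two views disagree on the top gene, so the GWAS term acts as the tie-breaker; how strongly external evidence should count is a design choice that the weight makes explicit.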

    Molecular estimation of neurodegeneration pseudotime in older brains.

    The temporal molecular changes that lead to disease onset and progression in Alzheimer's disease (AD) are still unknown. Here we develop a temporal model for these unobserved molecular changes with a manifold learning method applied to RNA-Seq data from human postmortem brain samples collected within the ROS/MAP and Mayo Clinic RNA-Seq studies. We define an ordering across samples based on their similarity in gene expression and use this ordering to estimate the molecular disease stage, or disease pseudotime, for each sample. Disease pseudotime is strongly concordant with the burden of tau (Braak score, P = 1.0 × 10⁻⁵), Aβ (CERAD score, P = 1.8 × 10⁻⁵), and cognitive diagnosis (P = 3.5 × 10⁻⁷) of late-onset (LO) AD. Early stage disease pseudotime samples are enriched for controls and show changes in basic cellular functions. Late stage disease pseudotime samples are enriched for late stage AD cases and show changes in neuroinflammation and amyloid pathologic processes. We also identify a set of late stage pseudotime samples that are controls and show changes in genes enriched for protein trafficking, splicing, regulation of apoptosis, and prevention of amyloid cleavage pathways. In summary, we present a method for ordering patients along a trajectory of LOAD disease progression from brain transcriptomic data.
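
    Ordering samples by expression similarity and reading the position as pseudotime can be sketched in a few lines. The abstract does not name the manifold learning method, so the sketch below swaps in a much simpler linear stand-in: projection onto the first principal component, found by power iteration. The function name, the `root` parameter (index of a sample assumed to be at the early end), and the toy data are all hypothetical.

```python
import math


def pseudotime_order(expr, root=0, n_iter=200):
    """Assign each sample a pseudotime in [0, 1] from its expression profile.

    expr -- list of samples, each a list of expression values (same genes)
    root -- index of a sample assumed to lie at the early end of the
            trajectory; used only to fix the otherwise arbitrary direction
    """
    n, d = len(expr), len(expr[0])
    means = [sum(row[j] for row in expr) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in expr]

    # Power iteration for the leading eigenvector of X^T X (first PC).
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(n_iter):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all samples identical; no direction to find
        v = [x / norm for x in w]

    proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    # The sign of a principal component is arbitrary: flip it if needed so
    # the root sample falls on the low (early) side.
    if proj[root] > 0:
        proj = [-p for p in proj]

    lo, hi = min(proj), max(proj)
    span = (hi - lo) or 1.0
    return [(p - lo) / span for p in proj]


# Four toy samples over two genes, deliberately out of order.
expr = [[3.0, 3.1], [0.0, 0.2], [2.0, 1.9], [1.0, 1.0]]
pt = pseudotime_order(expr, root=1)
order = sorted(range(len(expr)), key=pt.__getitem__)
```

    A linear projection only recovers a straight trajectory; a nonlinear manifold method would be needed for curved or branching progressions, which is presumably why the study uses one.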

    A genome-wide association analysis of temozolomide response using lymphoblastoid cell lines reveals a clinically relevant association with MGMT

    Recently, lymphoblastoid cell lines (LCLs) have emerged as an innovative model system for mapping gene variants that predict dose response to chemotherapy drugs. In the current study, this strategy was expanded to the in vitro genome-wide association approach, using 516 LCLs derived from a Caucasian cohort to assess cytotoxic response to temozolomide. Genome-wide association analysis using approximately 2.1 million quality-controlled single-nucleotide polymorphisms (SNPs) identified a statistically significant association (p < 10⁻⁸) with SNPs in the O⁶-methylguanine–DNA methyltransferase (MGMT) gene. We also demonstrate that the primary SNP in this region is significantly associated with differential gene expression of MGMT (p < 10⁻²⁶) in LCLs, and differential methylation in glioblastoma samples from The Cancer Genome Atlas. The previously documented clinical and functional relationships between MGMT and temozolomide response highlight the potential of well-powered GWAS of the LCL model system to identify meaningful genetic associations.
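
    To put the reported p < 10⁻⁸ in context, the following sketch computes a Bonferroni-corrected threshold for roughly 2.1 million tests. The abstract does not say which significance criterion the authors applied, so the Bonferroni correction and the family-wise error rate of 0.05 are assumptions made only for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumed family-wise error rate of 0.05 over ~2.1 million SNPs.
threshold = bonferroni_threshold(0.05, 2_100_000)

# An association with p below 1e-8 clears this cutoff (about 2.4e-8).
passes = 1e-8 < threshold
```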