181 research outputs found

    Developing Predictive Molecular Maps of Human Disease through Community-based Modeling

    The failure of biology to identify the molecular causes of disease has led to disappointment in the rate of development of new medicines. By combining the power of community-based modeling with broad access to large datasets on a platform that promotes reproducible analyses we can work towards more predictive molecular maps that can deliver better therapeutics

    Approaches to mapping genetically correlated complex traits

    Our Markov chain Monte Carlo (MCMC) methods were used in linkage analyses of the Framingham Heart Study data using all available pedigrees. Our goal was to detect and map loci associated with covariate-adjusted traits log triglyceride (lnTG) and high-density lipoprotein cholesterol (HDL) using multipoint LOD score analysis, Bayesian oligogenic linkage analysis and identity-by-descent (IBD) scoring methods. Each method used all marker data for all markers on a chromosome. Bayesian linkage analysis detected a linkage signal on chromosome 7 for lnTG and HDL, corroborating previously published results. However, these results were not replicated in a classical linkage analysis of the data or by using IBD scoring methods. We conclude that Bayesian linkage analysis provides a powerful paradigm for mapping trait loci but interpretation of the Bayesian linkage signals is subjective. In the absence of a LOD score method accommodating genetically complex traits and linkage heterogeneity, validation of these signals remains elusive
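The abstract above relies on LOD scores as the classical measure of linkage evidence. As a rough illustration only, the sketch below computes a two-point LOD score for phase-known meioses, which is much simpler than the multipoint and Bayesian oligogenic analyses the authors describe. The function name `two_point_lod` and the example counts are hypothetical and are not taken from the study.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses (illustrative sketch).

    Compares the likelihood of the observed recombinant count at
    recombination fraction `theta` against the no-linkage value 0.5.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")

    non_recombinants = meioses - recombinants
    # log10 likelihood under linkage at the given theta
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    # log10 likelihood under free recombination (theta = 0.5)
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


if __name__ == "__main__":
    # Hypothetical counts: 2 recombinants observed in 20 informative meioses.
    print(round(two_point_lod(2, 20, 0.1), 3))
```

A LOD score of 3 or more is the conventional threshold for declaring linkage in this simple setting; the abstract's point is that no comparable LOD method exists for genetically complex traits with linkage heterogeneity.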

    Identifying and ranking potential driver genes of Alzheimer's disease using multiview evidence aggregation.

    MOTIVATION: Late onset Alzheimer's disease currently has no known effective treatment options. To better understand the disease, new multi-omic datasets have recently been generated with the goal of identifying molecular causes of disease. However, most analytic studies using these datasets focus on uni-modal analysis of the data. Here, we propose a data driven approach to integrate multiple data types and analytic outcomes to aggregate evidence supporting the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our article are: (i) a general machine learning framework to learn the key characteristics of a few known driver genes from multiple feature sets and identify other potential driver genes which have similar feature representations, and (ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types. RESULTS: We demonstrate the utility of our machine learning algorithm on two benchmark multiview datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer's. We show that our ranked genes show a significant enrichment for single nucleotide polymorphisms associated with Alzheimer's and are enriched in pathways that have been previously associated with the disease. AVAILABILITY AND IMPLEMENTATION: Source code and links to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking

    Mapping the genetic architecture of gene expression in human liver

    Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
    By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process. © 2008 Schadt et al

    Molecular estimation of neurodegeneration pseudotime in older brains.

    The temporal molecular changes that lead to disease onset and progression in Alzheimer's disease (AD) are still unknown. Here we develop a temporal model for these unobserved molecular changes with a manifold learning method applied to RNA-Seq data from human postmortem brain samples collected within the ROS/MAP and Mayo Clinic RNA-Seq studies. We define an ordering across samples based on their similarity in gene expression and use this ordering to estimate the molecular disease stage, or disease pseudotime, for each sample. Disease pseudotime is strongly concordant with the burden of tau (Braak score, P = 1.0 × 10⁻⁵), Aβ (CERAD score, P = 1.8 × 10⁻⁵), and cognitive diagnosis (P = 3.5 × 10⁻⁷) of late-onset (LO) AD. Early stage disease pseudotime samples are enriched for controls and show changes in basic cellular functions. Late stage disease pseudotime samples are enriched for late stage AD cases and show changes in neuroinflammation and amyloid pathologic processes. We also identify a set of late stage pseudotime samples that are controls and show changes in genes enriched for protein trafficking, splicing, regulation of apoptosis, and prevention of amyloid cleavage pathways. In summary, we present a method for ordering patients along a trajectory of LOAD disease progression from brain transcriptomic data.

    Integrating genomic analysis with the genetic basis of gene expression: preliminary evidence of the identification of causal genes for cardiovascular and metabolic traits related to nutrition in Mexicans

    Whole-transcriptome expression profiling provides novel phenotypes for analysis of complex traits. Gene expression measurements reflect quantitative variation in transcript-specific messenger RNA levels and represent phenotypes lying close to the action of genes. Understanding the genetic basis of gene expression will provide insight into the processes that connect genotype to clinically significant traits, a central tenet of systems biology. Synchronous in vivo expression profiles of lymphocytes, muscle, and subcutaneous fat were obtained from healthy Mexican men. Most genes were expressed at detectable levels in multiple tissues, and RNA levels were correlated between tissue types. A subset of transcripts with high reliability of expression across tissues (estimated by intraclass correlation coefficients) was enriched for cis-regulated genes, suggesting that proximal sequence variants may influence expression similarly in different cellular environments. This integrative global gene expression profiling approach is proving extremely useful for identifying genes and pathways that contribute to complex clinical traits. The coincidence of clinical trait quantitative trait loci and expression quantitative trait loci can help in the prioritization of positional candidate genes. Such data will be crucial for the formal integration of positional and transcriptomic information characterized as genetical genomics.
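The abstract above uses intraclass correlation coefficients to measure how reliably a transcript is expressed across tissues. The abstract does not say which ICC form was used, so the sketch below assumes the one-way random-effects ICC, with one row per subject and one column per tissue. The function name `icc_one_way` and the toy values are hypothetical.

```python
from statistics import mean


def icc_one_way(rows: list[list[float]]) -> float:
    """One-way random-effects ICC for a subjects-by-measurements table.

    Each inner list holds one subject's expression values for a single
    transcript, one value per tissue. All rows must have equal length.
    """
    n = len(rows)
    k = len(rows[0])
    if n < 2 or k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 subjects and 2 equal-length columns")

    grand_mean = mean(x for r in rows for x in r)
    row_means = [mean(r) for r in rows]

    # Between-subject and within-subject mean squares from one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(rows, row_means) for x in r
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    # Toy data: three subjects, one transcript measured in two tissues.
    print(round(icc_one_way([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]), 3))
```

Values near 1 indicate that differences between subjects dominate differences between tissues, which is the sense of "high reliability of expression across tissues" in the abstract.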

    Genome-Wide Linkage and Admixture Mapping of Type 2 Diabetes in African American Families From the American Diabetes Association GENNID (Genetics of NIDDM) Study Cohort

    OBJECTIVE: We used a single nucleotide polymorphism (SNP) map in a large cohort of 580 African American families to identify regions linked to type 2 diabetes, age of type 2 diabetes diagnosis, and BMI.

    Young people's data governance preferences for their mental health data: MindKind Study findings from India, South Africa, and the United Kingdom

    Mobile devices offer a scalable opportunity to collect longitudinal data that facilitate advances in mental health treatment to address the burden of mental health conditions in young people. Sharing these data with the research community is critical to gaining maximal value from rich data of this nature. However, the highly personal nature of the data necessitates understanding the conditions under which young people are willing to share them. To answer this question, we developed the MindKind Study, a multinational, mixed methods study that solicits young people's preferences for how their data are governed and quantifies potential participants' willingness to join under different conditions. We employed a community-based participatory approach, involving young people as stakeholders and co-researchers. At sites in India, South Africa, and the UK, we enrolled 3575 participants ages 16-24 in the mobile app-mediated quantitative study and 143 participants in the public deliberation-based qualitative study. We found that while youth participants have strong preferences for data governance, these preferences did not translate into (un)willingness to join the smartphone-based study. Participants grappled with the risks and benefits of participation as well as their desire that the "right people" access their data. Throughout the study, we recognized young people's commitment to finding solutions and co-producing research architectures to allow for more open sharing of mental health data to accelerate and derive maximal benefit from research