69 research outputs found

    Network reconstruction for trans acting genetic loci using multi-omics data and prior information

    Get PDF
    BACKGROUND: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view on biological systems and their integration is crucial for gaining insights in complex regulatory processes. These data can be used to explain disease related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor and use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information. METHODS: We devised a new strategy to integrate QTL with human population scale multi-omics data. State-of-the art network inference methods including BDgraph and glasso were applied to these data. Comprehensive prior information to guide network inference was manually curated from large-scale biological databases. The inference approach was extensively benchmarked using simulated data and cross-cohort replication analyses. Best performing methods were subsequently applied to real-world human cohort data. RESULTS: Our benchmarks showed that prior-based strategies outperform methods without prior information in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass for which we generated novel functional hypotheses. CONCLUSIONS: We demonstrate that existing biological knowledge can improve the integrative analysis of networks underlying trans associations and generate novel hypotheses about regulatory mechanisms

    Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts.

    Get PDF
    It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene1. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches2-5. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases6-8. This includes muscle biopsies from patients with undiagnosed rare muscle disorders6,9, and cultured fibroblasts from patients with mitochondrial disorders7. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution

    The impact of sex on gene expression across human tissues

    Full text link
    Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation

    Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes

    Get PDF
    Type 2 diabetes (T2D) is a very common disease in humans. Here we conduct a meta-analysis of genome-wide association studies (GWAS) with ~16 million genetic variants in 62,892 T2D cases and 596,424 controls of European ancestry. We identify 139 common and 4 rare variants associated with T2D, 42 of which (39 common and 3 rare variants) are independent of the known variants. Integration of the gene expression data from blood (n = 14,115 and 2765) with the GWAS results identifies 33 putative functional genes for T2D, 3 of which were targeted by approved drugs. A further integration of DNA methylation (n = 1980) and epigenomic annotation data highlight 3 genes (CAMK1D, TP53INP1, and ATP5G1) with plausible regulatory mechanisms, whereby a genetic variant exerts an effect on T2D through epigenetic regulation of gene expression. Our study uncovers additional loci, proposes putative genetic regulatory mechanisms for T2D, and provides evidence of purifying selection for T2D-associated variants

    Mendelian randomization integrating GWAS and eQTL data reveals genetic determinants of complex and clinical traits

    Get PDF
    Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene-trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2, known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits

    Genetic effects on gene expression across human tissues

    Get PDF
    Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of diseas

    Genetic determinants of telomere length from 109,122 ancestrally diverse whole-genome sequences in TOPMed

    Get PDF
    Genetic studies on telomere length are important for understanding age-related diseases. Prior GWAS for leukocyte TL have been limited to European and Asian populations. Here, we report the first sequencing-based association study for TL across ancestrally-diverse individuals (European, African, Asian and Hispanic/Latino) from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. We used whole genome sequencing (WGS) of whole blood for variant genotype calling and the bioinformatic estimation of telomere length in n=109,122 individuals. We identified 59 sentinel variants (p-value OBFC1indicated the independent signals colocalized with cell-type specific eQTLs for OBFC1 (STN1). Using a multi-variant gene-based approach, we identified two genes newly implicated in telomere length, DCLRE1B (SNM1B) and PARN. In PheWAS, we demonstrated our TL polygenic trait scores (PTS) were associated with increased risk of cancer-related phenotypes

    Refining Attention-Deficit/Hyperactivity Disorder and Autism Spectrum Disorder Genetic Loci by Integrating Summary Data From Genome-wide Association, Gene Expression, and DNA Methylation Studies

    Get PDF
    Background: Recent genome-wide association studies (GWASs) identified the first genetic loci associated with attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). The next step is to use these results to increase our understanding of the biological mechanisms involved. Most of the identified variants likely influence gene regulation. The aim of the current study is to shed light on the mechanisms underlying the genetic signals and prioritize genes by integrating GWAS results with gene expression and DNA methylation (DNAm) levels. Methods: We applied summary-data–based Mendelian randomization to integrate ADHD and ASD GWAS data with fetal brain expression and methylation quantitative trait loci, given the early onset of these disorders. We also analyzed expression and methylation quantitative trait loci datasets of adult brain and blood, as these provide increased statistical power. We subsequently used summary-data–based Mendelian randomization to investigate if the same variant influences both DNAm and gene expression levels. Results: We identified multiple gene expression and DNAm levels in fetal brain at chromosomes 1 and 17 that were associated with ADHD and ASD, respectively, through pleiotropy at shared genetic variants. The analyses in brain and blood showed additional associated gene expression and DNAm levels at the same and additional loci, likely because of increased statistical power. Several of the associated genes have not been identified in ADHD and ASD GWASs before. Conclusions: Our findings identified the genetic variants associated with ADHD and ASD that likely act through gene regulation. This facilitates prioritization of candidate genes for functional follow-up studies
    • 

    corecore