25 research outputs found
Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts.
It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene [1]. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches [2-5]. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases [6-8]. This includes muscle biopsies from patients with undiagnosed rare muscle disorders [6,9], and cultured fibroblasts from patients with mitochondrial disorders [7]. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution.
Genetic effects on gene expression across human tissues
Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes
The penetrance of variants in monogenic disease and the clinical utility of common polygenic variation have not been well explored on a large scale. Here, the authors use exome sequencing data from 77,184 individuals to generate penetrance estimates and assess the utility of polygenic variation in risk prediction of monogenic variants.
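As a rough illustration of what a penetrance estimate involves, the sketch below computes the fraction of variant carriers who show the phenotype, together with a Wilson score interval. The function name, the counts in the usage line, and the choice of the Wilson interval are assumptions made for illustration; the abstract does not describe how the authors computed their estimates.

```python
from math import sqrt


def penetrance_estimate(affected_carriers, total_carriers, z=1.96):
    """Return (point estimate, lower bound, upper bound) for penetrance.

    Penetrance is taken here as the proportion of carriers who are affected.
    The interval is a Wilson score interval (an illustrative choice, not
    necessarily the authors' method); z=1.96 corresponds to roughly 95%.
    """
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must lie in [0, total_carriers]")

    n = total_carriers
    p = affected_carriers / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts, not taken from the study.
estimate, lower, upper = penetrance_estimate(30, 100)
```

With few carriers the interval becomes wide, which is one reason large cohorts are needed for stable penetrance estimates.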
Population- and individual-specific regulatory variation in Sardinia
Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk. M.P. is supported by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement 633964 (ImmunoAgeing). Z.Z. is supported by the National Science Foundation (NSF) GRFP (DGE-114747) and by the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG). Z.Z., J.R.D., and G.T.H. also acknowledge support from the Stanford Genome Training Program (SGTP; NIH/NHGRI T32HG000044). J.R.D. is supported by the Stanford Graduate Fellowship. K.R.K. is supported by Department of Defense, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship 32 CFR 168a. S.J.S. is supported by the NIHR Cambridge Biomedical Research Centre.
The SardiNIA project is supported in part by the intramural program of the National Institute on Aging through contract HHSN271201100005C to the Consiglio Nazionale delle Ricerche of Italy. The RNA sequencing was supported by the PB05 InterOmics MIUR Flagship grant; by the FaReBio2011 “Farmaci e Reti Biotecnologiche di Qualità” grant; and by Sardinian Autonomous Region (L.R. no. 7/2009) grant cRP3-154 to F. Cucca, who is also supported by the Italian Foundation for Multiple Sclerosis (FISM 2015/R/09) and by the Fondazione di Sardegna (ex Fondazione Banco di Sardegna, Prot. U1301.2015/AI.1157.BE Prat. 2015-1651). S.B.M. is supported by the US National Institutes of Health through R01HG008150, R01MH101814, U01HG007436, and U01HG009080. All of the authors would like to thank the CRS4 and the SCGPM for the computational infrastructure supporting this project.
Impact Of The X Chromosome And Sex On Regulatory Variation
The unique mode of inheritance and regulatory mechanisms of the X chromosome have resulted in distinct patterns of evolution that shape its genetic architecture and the impact of genetic variation between the sexes. Due to these characteristics, however, the X chromosome has often been excluded from genetic analyses. We characterize the impact of the X chromosome and sex on human regulatory variation through analysis of genetic and gene expression data in a cohort of 922 individuals (whole blood RNA-sequencing from 274 males and 648 females). We identify higher variance in gene expression on the X chromosome compared to the autosomes and find that differences in variance are more likely to be sex-specific on X due to the hemizygous exposure of cis regulatory variation in males. Furthermore, we identify that cis-expression quantitative trait loci (eQTL) have weaker effects and influence fewer genes on the X chromosome compared to the autosomes, especially among genes with strong purifying selection. Despite this, we discover a higher proportion of sex-specific eQTLs on the X chromosome compared to autosomes. To subsequently identify the molecular mechanisms underlying discovered sex-specific eQTLs, we generate and connect sex-specific chromatin accessibility (ATAC-seq) to sex-specific expression and eQTL. Furthermore, as sex-specific eQTL can inform sex-specific effects of genetic variation on disease, we integrate eQTL with genome-wide association study data for multiple immune traits to identify sex-specific effect sizes for multiple trait loci. Together, our study provides a genome-wide understanding of how the X chromosome and sex shape human gene regulation and disease.
Colocalization of Gene Expression and DNA Methylation with Genetic Risk Variants Supports Functional Roles of MUC5B and DSP in Idiopathic Pulmonary Fibrosis.
Rationale: Common genetic variants have been associated with idiopathic pulmonary fibrosis (IPF). Objectives: To determine functional relevance of the 10 IPF-associated common genetic variants we previously identified. Methods: We performed expression quantitative trait loci (eQTL) and methylation quantitative trait loci (mQTL) mapping, followed by co-localization of eQTL and mQTL with genetic association signals and functional validation by luciferase reporter assays. Illumina multi-ethnic genotyping arrays, mRNA sequencing, and Illumina 850k methylation arrays were performed on lung tissue of participants with IPF (234 RNA and 345 DNA samples) and non-diseased controls (188 RNA and 202 DNA samples). Measurements and Main Results: Focusing on genetic variants within 10 IPF-associated genetic loci, we identified 27 eQTLs in controls and 24 eQTLs in cases (false-discovery-rate-adjusted P < 0.05). Among these signals, we identified associations of lead variants rs35705950 with expression of MUC5B and rs2076295 with expression of DSP in both cases and controls. mQTL analysis identified CpGs in gene bodies of MUC5B (cg17589883) and DSP (cg08964675) associated with the lead variants in these two loci. We also demonstrated strong co-localization of eQTL/mQTL and genetic signal in MUC5B (rs35705950) and DSP (rs2076295). Functional validation of the mQTL in MUC5B using luciferase reporter assays demonstrates that the CpG resides within a putative internal repressor element. Conclusions: We have established a relationship of the common IPF genetic risk variants rs35705950 and rs2076295 with respective changes in MUC5B and DSP expression and methylation. These results provide additional evidence that both MUC5B and DSP are involved in the etiology of IPF.
The mutational constraint spectrum quantified from variation in 141,456 humans
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
The impact of rare variation on gene expression across tissues
Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
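To make the notion of an expression outlier concrete, the following minimal sketch flags individuals whose expression of a gene lies far from the cohort mean, using a per-gene z score. This is a single-tissue simplification under stated assumptions: the function name, the threshold of 2, and the toy values are hypothetical, and it is not the multi-tissue procedure or the RIVER model described in the abstract.

```python
from statistics import mean, stdev


def expression_outliers(expression, threshold=2.0):
    """Flag (gene, sample, z) triples whose |z score| meets the threshold.

    `expression` maps gene -> {sample id -> normalized expression value}.
    The z score is computed per gene across all samples; negative values
    indicate underexpression, positive values overexpression.
    """
    outliers = []
    for gene, values in expression.items():
        if len(values) < 3:
            continue  # too few samples for a meaningful spread
        mu = mean(values.values())
        sd = stdev(values.values())
        if sd == 0:
            continue  # no variation, so no outliers can be defined
        for sample, value in values.items():
            z = (value - mu) / sd
            if abs(z) >= threshold:
                outliers.append((gene, sample, z))
    return outliers


# Toy data (hypothetical): sample s10 strongly underexpresses GENE_A.
gene_a = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 2.0]
gene_b = [5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 5.1, 4.9, 5.05, 4.95]
toy = {
    "GENE_A": {f"s{i + 1}": v for i, v in enumerate(gene_a)},
    "GENE_B": {f"s{i + 1}": v for i, v in enumerate(gene_b)},
}
flagged = expression_outliers(toy)
```

In a real analysis the flagged gene and individual pairs would then be checked for nearby rare variants, which is the enrichment the abstract reports.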