The genetic determinants of recurrent somatic mutations in 43,693 blood genomes
Nononcogenic somatic mutations are thought to be uncommon and inconsequential. To test this, we analyzed 43,693 National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine blood whole genomes from 37 cohorts and identified 7,131 non-missense somatic mutations that recur in at least 50 individuals. These recurrent non-missense somatic mutations (RNMSMs) are not clearly explained by other clonal phenomena such as clonal hematopoiesis. RNMSM prevalence increased with age, with an average 50-year-old having 27 RNMSMs. Inherited germline variation was associated with RNMSM acquisition. These variants were found in genes involved in adaptive immune function, proinflammatory cytokine production, and lymphoid lineage commitment. In addition, the presence of eight specific RNMSMs was associated with blood cell traits at effect sizes comparable to Mendelian genetic mutations. Overall, we found that somatic mutations in blood are an unexpectedly common phenomenon with ancestry-specific determinants and human health consequences
Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale.
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol
Whole-exome sequence analysis of anthropometric traits illustrates challenges in identifying effects of rare genetic variants
Anthropometric traits, measuring body size and shape, are highly heritable and significant clinical risk factors for cardiometabolic disorders. These traits have been extensively studied in genome-wide association studies (GWASs), with hundreds of genome-wide significant loci identified. We performed a whole-exome sequence analysis of the genetics of height, body mass index (BMI) and waist/hip ratio (WHR). We meta-analyzed single-variant and gene-based associations of whole-exome sequence variation with height, BMI, and WHR in up to 22,004 individuals, and we assessed replication of our findings in up to 16,418 individuals from 10 independent cohorts from Trans-Omics for Precision Medicine (TOPMed). We identified four trait associations with single-nucleotide variants (SNVs; two for height and two for BMI) and replicated the LECT2 gene association with height. Our expression quantitative trait locus (eQTL) analysis within previously reported GWAS loci implicated CEP63 and RFT1 as potential functional genes for known height loci. We further assessed enrichment of SNVs that were monogenic or syndromic variants within loci associated with our three traits. This yielded significant enrichment for height, whereas we observed no Bonferroni-corrected significance across all SNVs. With a sample size of ∼20,000 whole-exome sequences in our discovery dataset, our findings demonstrate the importance of genomic sequencing in genetic association studies, yet they also illustrate the challenges in identifying effects of rare genetic variants
Inherited causes of clonal haematopoiesis in 97,691 whole genomes.
Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown(1). The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer(2-4) and coronary heart disease(5); this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP)(6). Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues
A System for Phenotype Harmonization in the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program
Genotype-phenotype association studies often combine phenotype data from multiple studies to increase statistical power. Harmonization of the data usually requires substantial effort due to heterogeneity in phenotype definitions, study design, data collection procedures, and data-set organization. Here we describe a centralized system for phenotype harmonization that includes input from phenotype domain and study experts, quality control, documentation, reproducible results, and data-sharing mechanisms. This system was developed for the National Heart, Lung, and Blood Institute's Trans-Omics for Precision Medicine (TOPMed) program, which is generating genomic and other -omics data for more than 80 studies with extensive phenotype data. To date, 63 phenotypes have been harmonized across thousands of participants (recruited in 1948-2012) from up to 17 studies per phenotype. Here we discuss challenges in this undertaking and how they were addressed. The harmonized phenotype data and associated documentation have been submitted to National Institutes of Health data repositories for controlled access by the scientific community. We also provide materials to facilitate future harmonization efforts by the community, which include 1) the software code used to generate the 63 harmonized phenotypes, enabling others to reproduce, modify, or extend these harmonizations to additional studies, and 2) the results of labeling thousands of phenotype variables with controlled vocabulary terms
Whole Genome Sequence Analysis of the Plasma Proteome in Black Adults Provides Novel Insights Into Cardiovascular Disease
Background: Plasma proteins are critical mediators of cardiovascular processes and are the targets of many drugs. Previous efforts to characterize the genetic architecture of the plasma proteome have been limited by a focus on individuals of European descent and leveraged genotyping arrays and imputation. Here we describe whole genome sequence analysis of the plasma proteome in individuals with greater African ancestry, increasing our power to identify novel genetic determinants. Methods: Proteomic profiling of 1301 proteins was performed in 1852 Black adults from the Jackson Heart Study using aptamer-based proteomics (SomaScan). Whole genome sequencing association analysis was performed for all variants with minor allele count ≥5. Results were validated using an alternative, antibody-based, proteomic platform (Olink) as well as replicated in the Multi-Ethnic Study of Atherosclerosis and the HERITAGE Family Study (Health, Risk Factors, Exercise Training and Genetics). Results: We identify 569 genetic associations between 479 proteins and 438 unique genetic regions at a Bonferroni-adjusted significance level of 3.8×10⁻¹¹. These associations include 114 novel locus-protein relationships and an additional 217 novel sentinel variant-protein relationships. Novel cardiovascular findings include new protein associations at the APOE gene locus including ZAP70 (sentinel single nucleotide polymorphism [SNP] rs7412-T, β=0.61±0.05, P=3.27×10⁻³⁰) and MMP-3 (β=-0.60±0.05, P=1.67×10⁻³²), as well as a completely novel pleiotropic locus at the HPX gene, associated with 9 proteins. Further, the associations suggest new mechanisms of genetically mediated cardiovascular disease linked to African ancestry; we identify a novel association between variants linked to APOL1-associated chronic kidney and heart disease and the protein CKAP2 (rs73885319-G, β=0.34±0.04, P=1.34×10⁻¹⁷) as well as an association between ATTR amyloidosis and RBP4 levels in community-dwelling individuals without heart failure. Conclusions: Taken together, these results provide evidence for the functional importance of variants in non-European populations, and suggest new biological mechanisms for ancestry-specific determinants of lipids, coagulation, and myocardial function
Aberrant activation of TCL1A promotes stem cell expansion in clonal haematopoiesis
Mutations in a diverse set of driver genes increase the fitness of haematopoietic stem cells (HSCs), leading to clonal haematopoiesis(1). These lesions are precursors for blood cancers(2-6), but the basis of their fitness advantage remains largely unknown, partly owing to a paucity of large cohorts in which the clonal expansion rate has been assessed by longitudinal sampling. Here, to circumvent this limitation, we developed a method to infer the expansion rate from data from a single time point. We applied this method to 5,071 people with clonal haematopoiesis. A genome-wide association study revealed that a common inherited polymorphism in the TCL1A promoter was associated with a slower expansion rate in clonal haematopoiesis overall, but the effect varied by driver gene. Those carrying this protective allele exhibited markedly reduced growth rates or prevalence of clones with driver mutations in TET2, ASXL1, SF3B1 and SRSF2, but this effect was not seen in clones with driver mutations in DNMT3A. TCL1A was not expressed in normal or DNMT3A-mutated HSCs, but the introduction of mutations in TET2 or ASXL1 led to the expression of TCL1A protein and the expansion of HSCs in vitro. The protective allele restricted TCL1A expression and expansion of mutant HSCs, as did experimental knockdown of TCL1A expression. Forced expression of TCL1A promoted the expansion of human HSCs in vitro and mouse HSCs in vivo. Our results indicate that the fitness advantage of several commonly mutated driver genes in clonal haematopoiesis may be mediated by TCL1A activation