64 research outputs found

    A data-driven medication score predicts 10-year mortality among aging adults

    Health differences among the elderly and the role of medical treatments are topical issues in aging societies. We demonstrate the use of modern statistical learning methods to develop a data-driven health measure based on 21 years of pharmacy purchase and mortality data of 12,047 aging individuals. The resulting score was validated with 33,616 individuals from two fully independent datasets and it is strongly associated with all-cause mortality (HR 1.18 per point increase in score; 95% CI 1.14-1.22; p=2.25e-16). When combined with the Charlson comorbidity index, individuals with elevated medication score and comorbidity index had over six times higher risk (HR 6.30; 95% CI 3.84-10.3; AUC=0.802) compared to individuals with a protective score profile. Alone, the medication score performs similarly to the Charlson comorbidity index and is associated with polygenic risk for coronary heart disease and type 2 diabetes. Peer reviewed.

    A structural variation reference for medical and population genetics

    Structural variants (SVs) rearrange large segments of DNA(1) and can have profound consequences in evolution and human disease(2,3). As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD)(4) have become integral in the interpretation of single-nucleotide variants (SNVs)(5). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage(6). We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings(7). This SV resource is freely distributed via the gnomAD browser(8) and will have broad utility in population genetics, disease-association studies, and diagnostic screening. Peer reviewed.

    The mutational constraint spectrum quantified from variation in 141,456 humans

    Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes(1). Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases. Peer reviewed.

    Evaluating drug targets through human loss-of-function genetic variation

    Naturally occurring human genetic variants that are predicted to inactivate protein-coding genes provide an in vivo model of human gene inactivation that complements knockout studies in cells and model organisms. Here we report three key findings regarding the assessment of candidate drug targets using human loss-of-function variants. First, even essential genes, in which loss-of-function variants are not tolerated, can be highly successful as targets of inhibitory drugs. Second, in most genes, loss-of-function variants are sufficiently rare that genotype-based ascertainment of homozygous or compound heterozygous 'knockout' humans will await sample sizes that are approximately 1,000 times those presently available, unless recruitment focuses on consanguineous individuals. Third, automated variant annotation and filtering are powerful, but manual curation remains crucial for removing artefacts, and is a prerequisite for recall-by-genotype efforts. Our results provide a roadmap for human knockout studies and should guide the interpretation of loss-of-function variants in drug development. Peer reviewed.

    Characterising the loss-of-function impact of 5' untranslated region variants in 15,708 individuals

    Upstream open reading frames (uORFs) are tissue-specific cis-regulators of protein translation. Isolated reports have shown that variants that create or disrupt uORFs can cause disease. Here, in a systematic genome-wide study using 15,708 whole genome sequences, we show that variants that create new upstream start codons, and variants disrupting stop sites of existing uORFs, are under strong negative selection. This selection signal is significantly stronger for variants arising upstream of genes intolerant to loss-of-function variants. Furthermore, variants creating uORFs that overlap the coding sequence show signals of selection equivalent to coding missense variants. Finally, we identify specific genes where modification of uORFs likely represents an important disease mechanism, and report a novel uORF frameshift variant upstream of NF2 in neurofibromatosis. Our results highlight uORF-perturbing variants as an under-recognised functional class that contributes to penetrant human disease, and demonstrate the power of large-scale population sequencing data in studying non-coding variant classes. Upstream open reading frames (uORFs), located in 5' untranslated regions, are regulators of downstream protein translation. Here, Whiffin et al. use the genomes of 15,708 individuals in the Genome Aggregation Database (gnomAD) to systematically assess the deleteriousness of variants creating or disrupting uORFs. Peer reviewed.

    Transcript expression-aware annotation improves rare variant interpretation

    The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD)(1), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project(2) and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies. Peer reviewed.

    Cross-trait analyses with migraine reveal widespread pleiotropy and suggest a vascular component to migraine headache

    Background: Nearly a fifth of the world's population suffer from migraine headache, yet risk factors for this disease are poorly characterized. Methods: To further elucidate these factors, we conducted a genetic correlation analysis using cross-trait linkage disequilibrium (LD) score regression between migraine headache and 47 traits from the UK Biobank. We then tested for possible causality between these phenotypes and migraine, using Mendelian randomization. In addition, we attempted replication of our findings in an independent genome-wide association study (GWAS) when available. Results: We report multiple phenotypes with genetic correlation (P < 1.06 × 10⁻³) with migraine, including heart disease, type 2 diabetes, lipid levels, blood pressure, autoimmune and psychiatric phenotypes. In particular, we find evidence that blood pressure directly contributes to migraine and explains a previously suggested causal relationship between calcium and migraine. Conclusions: This is the largest genetic correlation analysis of migraine headache to date, both in terms of migraine GWAS sample size and the number of phenotypes tested. We find that migraine has a shared genetic basis with a large number of traits, indicating pervasive pleiotropy at migraine-associated loci. Peer reviewed.

    Migraine, inflammatory bowel disease and celiac disease: A Mendelian randomization study

    Objective: To assess whether migraine may be genetically and/or causally associated with inflammatory bowel disease (IBD) or celiac disease. Background: Migraine has been linked to IBD and celiac disease in observational studies, but whether this link may be explained by a shared genetic basis or could be causal has not been established. The presence of a causal association could be clinically relevant, as treating one of these medical conditions might mitigate the symptoms of a causally linked condition. Methods: Linkage disequilibrium score regression and two-sample bidirectional Mendelian randomization analyses were performed using summary statistics from cohort-based genome-wide association studies of migraine (59,674 cases; 316,078 controls), IBD (25,042 cases; 34,915 controls) and celiac disease (11,812 or 4,533 cases; 11,837 or 10,750 controls). Migraine with and without aura were analyzed separately, as were the two IBD subtypes Crohn's disease and ulcerative colitis. Positive control analyses and conventional Mendelian randomization sensitivity analyses were performed. Results: Migraine was not genetically correlated with IBD or celiac disease. No evidence was observed for IBD (odds ratio [OR] 1.00, 95% confidence interval [CI] 0.99–1.02, p = 0.703) or celiac disease (OR 1.00, 95% CI 0.99–1.02, p = 0.912) causing migraine or migraine causing either IBD (OR 1.08, 95% CI 0.96–1.22, p = 0.181) or celiac disease (OR 1.08, 95% CI 0.79–1.48, p = 0.614) when all participants with migraine were analyzed jointly. There was some indication of a causal association between celiac disease and migraine with aura (OR 1.04, 95% CI 1.00–1.08, p = 0.045), between celiac disease and migraine without aura (OR 0.95, 95% CI 0.92–0.99, p = 0.006), as well as between migraine without aura and ulcerative colitis (OR 1.15, 95% CI 1.02–1.29, p = 0.025). However, the results were not significant after multiple testing correction. Conclusions: We found no evidence of a shared genetic basis or of a causal association between migraine and either IBD or celiac disease, although we obtained some indications of causal associations with migraine subtypes.

    Common Variant Burden Contributes to the Familial Aggregation of Migraine in 1,589 Families

    Complex traits, including migraine, often aggregate in families, but the underlying genetic architecture behind this is not well understood. The aggregation could be explained by rare, penetrant variants that segregate according to Mendelian inheritance or by the sufficient polygenic accumulation of common variants, each with an individually small effect, or a combination of the two hypotheses. In 8,319 individuals across 1,589 migraine families, we calculated migraine polygenic risk scores (PRS) and found a significantly higher common variant burden in familial cases (n = 5,317, OR = 1.76, 95% CI = 1.71-1.81, p = 1.7 × 10⁻¹⁰⁹) compared to population cases from the FINRISK cohort (n = 1,101, OR = 1.32, 95% CI = 1.25-1.38, p = 7.2 × 10⁻¹⁷). The PRS explained 1.6% of the phenotypic variance in the population cases and 3.5% in the familial cases (including 2.9% for migraine without aura, 5.5% for migraine with typical aura, and 8.2% for hemiplegic migraine). The results demonstrate a significant contribution of common polygenic variation to the familial aggregation of migraine.