21 research outputs found
Development and validation of a trans-ancestry polygenic risk score for type 2 diabetes in diverse populations
Background
Type 2 diabetes (T2D) is a worldwide scourge caused by both genetic and environmental risk factors that disproportionately afflicts communities of color. Leveraging existing large-scale genome-wide association studies (GWAS), polygenic risk scores (PRS) have shown promise to complement established clinical risk factors and intervention paradigms, and improve early diagnosis and prevention of T2D. However, to date, T2D PRS have been most widely developed and validated in individuals of European descent. Comprehensive assessment of T2D PRS in non-European populations is critical for equitable deployment of PRS to clinical practice that benefits global populations.
Methods
We integrated T2D GWAS in European, African, and East Asian populations to construct a trans-ancestry T2D PRS using a newly developed Bayesian polygenic modeling method, and assessed the prediction accuracy of the PRS in the multi-ethnic Electronic Medical Records and Genomics (eMERGE) study (11,945 cases; 57,694 controls), four Black cohorts (5137 cases; 9657 controls), and the Taiwan Biobank (4570 cases; 84,996 controls). We additionally evaluated a post hoc ancestry adjustment method that can express the polygenic risk on the same scale across ancestrally diverse individuals and facilitate the clinical implementation of the PRS in prospective cohorts.
Results
The trans-ancestry PRS was significantly associated with T2D status across the ancestral groups examined. The top 2% of the PRS distribution can identify individuals with an approximately 2.5- to 4.5-fold increase in T2D risk, which corresponds to the increased risk of T2D for first-degree relatives. The post hoc ancestry adjustment method eliminated major distributional differences in the PRS across ancestries without compromising its predictive performance.
Conclusions
By integrating T2D GWAS from multiple populations, we developed and validated a trans-ancestry PRS, and demonstrated its potential as a meaningful index of risk among diverse patients in clinical settings. Our efforts represent the first step towards the implementation of the T2D PRS into routine healthcare.
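The abstract above does not spell out how the post hoc ancestry adjustment works. As a minimal sketch, assuming one common approach (regressing the raw PRS on principal components of genetic ancestry and rescaling the residuals), the idea of putting scores "on the same scale" might look like the following. The function name, the simulated data, and the choice to adjust only the mean are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def adjust_prs_for_ancestry(prs, pcs):
    """Residualize a raw PRS on ancestry PCs, then rescale to unit variance.

    Hypothetical helper: removes the part of the score that is linearly
    predictable from the principal components (mean adjustment only).
    """
    prs = np.asarray(prs, dtype=float)
    pcs = np.asarray(pcs, dtype=float)
    # Design matrix with an intercept column followed by the PCs.
    design = np.column_stack([np.ones(len(prs)), pcs])
    coef, *_ = np.linalg.lstsq(design, prs, rcond=None)
    residual = prs - design @ coef
    return residual / residual.std(ddof=1)

# Simulated example: the raw score drifts with two ancestry PCs.
rng = np.random.default_rng(0)
n = 2000
pcs = rng.normal(size=(n, 2))
raw_prs = 0.8 * pcs[:, 0] - 0.5 * pcs[:, 1] + rng.normal(size=n)
adjusted = adjust_prs_for_ancestry(raw_prs, pcs)
```

After adjustment the score has mean zero and unit variance and is uncorrelated with the supplied PCs, so a single percentile cutoff can be applied across groups. A fuller adjustment would also model ancestry-dependent variance, which this sketch omits.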
Genetic predictors of blood pressure traits are associated with preeclampsia
Preeclampsia, a pregnancy complication characterized by hypertension after 20 gestational weeks, is a major cause of maternal and neonatal morbidity and mortality. Mechanisms leading to preeclampsia are unclear; however, there is evidence of high heritability. We evaluated the association of polygenic scores (PGS) for blood pressure traits and preeclampsia to assess whether there is shared genetic architecture. Non-Hispanic Black and White reproductive age females with pregnancy indications and genotypes were obtained from Vanderbilt University’s BioVU, Electronic Medical Records and Genomics network, and Penn Medicine Biobank. Preeclampsia was defined by ICD codes. Summary statistics for diastolic blood pressure (DBP), systolic blood pressure (SBP), and pulse pressure (PP) PGS were acquired from Giri et al. Associations between preeclampsia and each PGS were evaluated separately by race and data source before subsequent meta-analysis. Ten-fold cross validation was used for prediction modeling. In 3504 Black and 5009 White included individuals, the rate of preeclampsia was 15.49%. In cross-ancestry meta-analysis, all PGSs were associated with preeclampsia (OR_DBP = 1.10, 95% CI 1.02–1.17, p = 7.68 × 10^-3; OR_SBP = 1.16, 95% CI 1.09–1.23, p = 2.23 × 10^-6; OR_PP = 1.14, 95% CI 1.07–1.27, p = 9.86 × 10^-5). Addition of PGSs to clinical prediction models did not improve predictive performance. Genetic factors contributing to blood pressure regulation in the general population also predispose to preeclampsia.
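The abstract above reports pooled odds ratios but does not state which meta-analysis model was used. As a minimal sketch, assuming a fixed-effect inverse-variance model on the log odds ratio scale, per-stratum estimates could be combined as follows. The function names and the input values are hypothetical and are not taken from the study.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of a log odds ratio from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def fixed_effect_meta(odds_ratios, std_errors, z=1.96):
    """Inverse-variance fixed-effect pooling of odds ratios.

    Returns (pooled OR, lower CI bound, upper CI bound).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled_log = sum(w * math.log(o) for w, o in zip(weights, odds_ratios)) / total
    pooled_se = math.sqrt(1.0 / total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )

# Hypothetical per-stratum results (e.g. one per race and data source).
stratum_ors = [1.12, 1.20, 1.15]
stratum_ses = [se_from_ci(1.00, 1.25), se_from_ci(1.05, 1.37), se_from_ci(1.04, 1.27)]
pooled_or, ci_low, ci_high = fixed_effect_meta(stratum_ors, stratum_ses)
```

Strata with narrower confidence intervals receive larger weights, so the pooled estimate is pulled towards the most precise results. A random-effects model would widen the interval when the strata disagree.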
Evidence of epistasis in regions of long-range linkage disequilibrium across five complex diseases in the UK Biobank and eMERGE datasets.
Leveraging linkage disequilibrium (LD) patterns as representative of population substructure enables the discovery of additive association signals in genome-wide association studies (GWASs). Standard GWASs are well-powered to interrogate additive models; however, new approaches are required for investigating other modes of inheritance such as dominance and epistasis. Epistasis, or non-additive interaction between genes, exists across the genome but often goes undetected because of a lack of statistical power. Furthermore, the adoption of LD pruning as customary in standard GWASs excludes detection of sites that are in LD but might underlie the genetic architecture of complex traits. We hypothesize that uncovering long-range interactions between loci with strong LD due to epistatic selection can elucidate genetic mechanisms underlying common diseases. To investigate this hypothesis, we tested for associations between 23 common diseases and 5,625,845 epistatic SNP-SNP pairs (determined by Ohta's D statistics) in long-range LD (>0.25 cM). Across five disease phenotypes, we identified one significant and four near-significant associations that replicated in two large genotype-phenotype datasets (UK Biobank and eMERGE). The genes that were most likely involved in the replicated associations were (1) members of highly conserved gene families with complex roles in multiple pathways, (2) essential genes, and/or (3) genes that were associated in the literature with complex traits that display variable expressivity. These results support the highly pleiotropic and conserved nature of variants in long-range LD under epistatic selection. Our work supports the hypothesis that epistatic interactions regulate diverse clinical mechanisms and might especially be driving factors in conditions with a wide range of phenotypic outcomes.
Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions
Introduction
Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness.
Methods
This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Records and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm.
Results
We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites.
Discussion and conclusion
Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.
Genetic investigation of fibromuscular dysplasia identifies risk loci and shared genetics with common cardiovascular diseases
Fibromuscular dysplasia (FMD) is an arteriopathy associated with hypertension, stroke and myocardial infarction, affecting mostly women. We report results from the first genome-wide association meta-analysis of six studies including 1556 FMD cases and 7100 controls. We find an estimate of SNP-based heritability compatible with FMD having a polygenic basis, and report four robustly associated loci (PHACTR1, LRP1, ATP2B1, and LIMA1). Transcriptome-wide association analysis in arteries identifies one additional locus (SLC24A3). We characterize open chromatin in arterial primary cells and find that FMD-associated variants are located in arterial-specific regulatory elements. Target genes are broadly involved in mechanisms related to actin cytoskeleton and intracellular calcium homeostasis, central to vascular contraction. We find significant genetic overlap between FMD and more common cardiovascular diseases and traits including blood pressure, migraine, intracranial aneurysm, and coronary artery disease.
Genome-wide association meta-analysis identifies risk loci for abdominal aortic aneurysm and highlights PCSK9 as a therapeutic target
Abdominal aortic aneurysm (AAA) is a common disease with substantial heritability. In this study, we performed a genome-wide association meta-analysis from 14 discovery cohorts and uncovered 141 independent associations, including 97 previously unreported loci. A polygenic risk score derived from meta-analysis explained AAA risk beyond clinical risk factors. Genes at AAA risk loci indicate involvement of lipid metabolism, vascular development and remodeling, extracellular matrix dysregulation and inflammation as key mechanisms in AAA pathogenesis. These genes also indicate overlap between the development of AAA and other monogenic aortopathies, particularly via transforming growth factor β signaling. Motivated by the strong evidence for the role of lipid metabolism in AAA, we used Mendelian randomization to establish the central role of non-high-density lipoprotein cholesterol in AAA and identified the opportunity for repurposing of proprotein convertase, subtilisin/kexin-type 9 (PCSK9) inhibitors. This was supported by a study demonstrating that PCSK9 loss of function prevented the development of AAA in a preclinical mouse model. Genome-wide association meta-analysis of AAA identifies 121 independent risk loci and highlights potential therapeutic targets such as proprotein convertase, subtilisin/kexin-type 9 (PCSK9).
Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network
The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP resulted in improved, or the same, precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can improve communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms are essential to support local customizations.
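The abstract above compares algorithms by precision and recall. As a reminder of what those two measures capture, here is a minimal sketch that scores a phenotype algorithm's output against a chart-reviewed gold standard. The function name and the patient identifiers are hypothetical; the abstract does not describe the evaluation code used by the network.

```python
def precision_recall(predicted_cases, gold_cases):
    """Score predicted case IDs against gold-standard case IDs.

    precision = true positives / all predicted cases
    recall    = true positives / all gold-standard cases
    """
    predicted, gold = set(predicted_cases), set(gold_cases)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical chart-review gold standard and two algorithm outputs.
gold = {"p2", "p3", "p4", "p5", "p6"}
rule_based = precision_recall({"p1", "p2", "p3", "p4"}, gold)
rule_based_plus_nlp = precision_recall({"p2", "p3", "p4", "p5"}, gold)
```

In this made-up example the NLP-enhanced output drops one false positive and finds one additional true case, so both measures rise. Real comparisons can also show one measure improving while the other stays flat.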