9 research outputs found

    Lessons learned and recommendations for data coordination in collaborative research: The CSER consortium experience

    Integrating data across heterogeneous research environments is a key challenge in multi-site, collaborative research projects. While it is important to allow for natural variation in data collection protocols across research sites, it is also important to achieve interoperability between datasets in order to reap the full benefits of collaborative work. However, there are few standards to guide the data coordination process from project conception to completion. In this paper, we describe the experiences of the Clinical Sequencing Evidence-Generating Research (CSER) consortium Data Coordinating Center (DCC), which coordinated harmonized survey and genomic sequencing data from seven clinical research sites from 2020 to 2022. Using input from multiple consortium working groups and from CSER leadership, we first identify 14 lessons learned from CSER in the categories of communication, harmonization, informatics, compliance, and analytics. We then distill these lessons learned into 11 recommendations for future research consortia in the areas of planning, communication, informatics, and analytics. We recommend that planning and budgeting for data coordination activities occur as early as possible during consortium conceptualization and development to minimize downstream complications. We also find that clear, reciprocal, and continuous communication between consortium stakeholders and the DCC is equally important to maintaining a secure and centralized informatics ecosystem for pooling data. Finally, we discuss the importance of actively interrogating current approaches to data governance, particularly for research studies that straddle the research-clinical divide.

    A polygenic and phenotypic risk prediction for polycystic ovary syndrome evaluated by phenome-wide association studies

    Context: As many as 75% of patients with polycystic ovary syndrome (PCOS) are estimated to be unidentified in clinical practice. Objective: Utilizing polygenic risk prediction, we aim to identify the phenome-wide comorbidity patterns characteristic of PCOS to improve accurate diagnosis and preventive treatment. Design, Patients, and Methods: Leveraging the electronic health records (EHRs) of 124 852 individuals, we developed a PCOS risk prediction algorithm by combining polygenic risk scores (PRS) with PCOS component phenotypes into a polygenic and phenotypic risk score (PPRS). We evaluated its predictive capability across different ancestries and performed a PRS-based phenome-wide association study (PheWAS) to assess the phenomic expression of the heightened risk of PCOS. Results: The integrated polygenic prediction improved the average performance (pseudo-R²) for PCOS detection by 0.228 (61.5-fold), 0.224 (58.8-fold), and 0.211 (57.0-fold) over the null model across European, African, and multi-ancestry participants, respectively. The subsequent PRS-powered PheWAS identified a high level of shared biology between PCOS and a range of metabolic and endocrine outcomes, especially with obesity and diabetes: "morbid obesity", "type 2 diabetes", "hypercholesterolemia", "disorders of lipid metabolism", "hypertension", and "sleep apnea" reaching phenome-wide significance. Conclusions: Our study has expanded the methodological utility of PRS in patient stratification and risk prediction, especially in a multifactorial condition like PCOS, across different genetic origins. By utilizing the individual genome-phenome data available from the EHR, our approach also demonstrates that polygenic prediction by PRS can provide valuable opportunities to discover the pleiotropic phenomic network associated with PCOS pathogenesis. Abbreviations: AA, African ancestry; ANOVA, analysis of variance; BMI, body mass index; EA, European ancestry; EHR, electronic health records; eMERGE, electronic Medical Records and Genomics Network; GWAS, genome-wide association study; IBD, identity-by-descent; ICD-CM, International Classification of Diseases, Clinical Modification; LD, linkage disequilibrium; MA, multi-ancestry; MAF, minor allele frequency; NIH, National Institutes of Health; PCA, principal component analysis; PheWAS, phenome-wide association study; PCOS, polycystic ovary syndrome; PPRS, polygenic and phenotypic risk score; PRS, polygenic risk sc
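
    For readers unfamiliar with the term, a polygenic risk score is commonly computed as a weighted sum of a person's risk-allele dosages, with per-variant weights taken from a prior association study. The abstract above does not state the exact formula or weights used, so the following is only a minimal sketch of that common definition; the function name, the weights, and the dosages are hypothetical, and the phenotypic component of the PPRS is not modeled.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-variant effect sizes (log odds ratios); not taken from the study.
example_weights = [0.12, -0.05, 0.30]
# Hypothetical dosages for one individual at the same three variants.
example_dosages = [2, 1, 0]

score = polygenic_risk_score(example_dosages, example_weights)
print(f"PRS = {score:.2f}")
```

    In practice such scores are usually standardized within a reference population before being used for stratification; that step is omitted here.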

    Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels.

    Leptin is an adipocyte-secreted hormone, the circulating levels of which correlate closely with overall adiposity. Although rare mutations in the leptin (LEP) gene are well known to cause leptin deficiency and severe obesity, no common loci regulating circulating leptin levels have been uncovered. Therefore, we performed a genome-wide association study (GWAS) of circulating leptin levels from 32,161 individuals and followed up loci reaching P < 10⁻⁶ in 19,979 additional individuals. We identify five loci robustly associated (P < 5 × 10⁻⁸) with leptin levels in/near LEP, SLC32A1, GCKR, CCNL1 and FTO. Although the association of the FTO obesity locus with leptin levels is abolished by adjustment for BMI, associations of the four other loci are independent of adiposity. The GCKR locus was found associated with multiple metabolic traits in previous GWAS and the CCNL1 locus with birth weight. Knockdown experiments in mouse adipose tissue explants show convincing evidence for adipogenin, a regulator of adipocyte differentiation, as the novel causal gene in the SLC32A1 locus influencing leptin levels. Our findings provide novel insights into the regulation of leptin production by adipose tissue and open new avenues for examining the influence of variation in leptin levels on adiposity and metabolic health.

    Combining Asian and European genome-wide association studies of colorectal cancer improves risk prediction across racial and ethnic populations

    Polygenic risk scores (PRS) have great potential to guide precision colorectal cancer (CRC) prevention by identifying those at higher risk to undertake targeted screening. However, current PRS using European ancestry data have sub-optimal performance in non-European ancestry populations, limiting their utility among these populations. Towards addressing this deficiency, we expand PRS development for CRC by incorporating Asian ancestry data (21,731 cases; 47,444 controls) into European ancestry training datasets (78,473 cases; 107,143 controls). The AUC estimates (95% CI) of PRS are 0.63 (0.62-0.64), 0.59 (0.57-0.61), 0.62 (0.60-0.63), and 0.65 (0.63-0.66) in independent datasets including 1681-3651 cases and 8696-115,105 controls of Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White, respectively. They are significantly better than the European-centric PRS in all four major US racial and ethnic groups (p-values < 0.05). Further inclusion of non-European ancestry populations, especially Black/African American and Latinx/Hispanic, is needed to improve the risk prediction and enhance equity in applying PRS in clinical practice.
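
    The AUC values reported above can be read as the probability that a randomly chosen case has a higher score than a randomly chosen control, with 0.5 meaning no discrimination. The sketch below illustrates that pairwise definition on made-up scores; the function name and all numbers are hypothetical, and it is not the estimation procedure used in the study.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Fraction of case/control pairs where the case scores higher (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores; not data from the study.
example_cases = [0.9, 0.6, 0.4]
example_controls = [0.5, 0.3, 0.6, 0.1]

print(f"AUC = {auc(example_cases, example_controls):.3f}")
```

    The double loop is quadratic in sample size and is meant only to make the definition explicit; rank-based formulas give the same value far more efficiently on datasets of the size described above.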

    Genome-wide association scan for childhood caries implicates novel genes

    Dental caries is the most common chronic disease in children and a major public health concern due to its increasing incidence, serious health and social co-morbidities, and socio-demographic disparities in disease burden. We performed the first genome-wide association scan for dental caries to identify associated genetic loci and nominate candidate genes affecting tooth decay in 1305 US children ages 3-12 yrs. Affection status was defined as 1 or more primary teeth with evidence of decay based on intra-oral examination. No associations met strict criteria for genome-wide significance (p < 10E-7); however, several loci (ACTN2, MTR, and EDARADD, MPPED2, and LPO) with plausible biological roles in dental caries exhibited suggestive evidence for association. Analyses stratified by home fluoride level yielded additional suggestive loci, including TFIP11 in the low-fluoride group, and EPHA7 and ZMPSTE24 in the sufficient-fluoride group. Suggestive loci were tested but not significantly replicated in an independent sample (N = 1695, ages 2-7 yrs) after adjustment for multiple comparisons. This study reinforces the complexity of dental caries, suggesting that numerous loci, mostly having small effects, are involved in cariogenesis. Verification/replication of suggestive loci may highlight biological mechanisms and/or pathways leading to a fuller understanding of the genetic risks for dental caries. © 2011 International & American Associations for Dental Research

    Loss-of-function mutations in APOC3, triglycerides, and coronary disease.

    BACKGROUND: Plasma triglyceride levels are heritable and are correlated with the risk of coronary heart disease. Sequencing of the protein-coding regions of the human genome (the exome) has the potential to identify rare mutations that have a large effect on phenotype. METHODS: We sequenced the protein-coding regions of 18,666 genes in each of 3734 participants of European or African ancestry in the Exome Sequencing Project. We conducted tests to determine whether rare mutations in coding sequence, individually or in aggregate within a gene, were associated with plasma triglyceride levels. For mutations associated with triglyceride levels, we subsequently evaluated their association with the risk of coronary heart disease in 110,970 persons. RESULTS: An aggregate of rare mutations in the gene encoding apolipoprotein C3 (APOC3) was associated with lower plasma triglyceride levels. Among the four mutations that drove this result, three were loss-of-function mutations: a nonsense mutation (R19X) and two splice-site mutations (IVS2+1G→A and IVS3+1G→T). The fourth was a missense mutation (A43T). Approximately 1 in 150 persons in the study was a heterozygous carrier of at least one of these four mutations. Triglyceride levels in the carriers were 39% lower than levels in noncarriers (P < 1 × 10⁻²⁰), and circulating levels of APOC3 in carriers were 46% lower than levels in noncarriers (P = 8 × 10⁻¹⁰). The risk of coronary heart disease among 498 carriers of any rare APOC3 mutation was 40% lower than the risk among 110,472 noncarriers (odds ratio, 0.60; 95% confidence interval, 0.47 to 0.75; P = 4 × 10⁻⁶). CONCLUSIONS: Rare mutations that disrupt APOC3 function were associated with lower levels of plasma triglycerides and APOC3. Carriers of these mutations were found to have a reduced risk of coronary heart disease.

    Large-Scale Exome-wide Association Analysis Identifies Loci for White Blood Cell Traits and Pleiotropy with Immune-Mediated Diseases

    White blood cells play diverse roles in innate and adaptive immunity. Genetic association analyses of phenotypic variation in circulating white blood cell (WBC) counts from large samples of otherwise healthy individuals can provide insights into genes and biologic pathways involved in production, differentiation, or clearance of particular WBC lineages (myeloid, lymphoid) and also potentially inform the genetic basis of autoimmune, allergic, and blood diseases. We performed an exome array-based meta-analysis of total WBC and subtype counts (neutrophils, monocytes, lymphocytes, basophils, and eosinophils) in a multi-ancestry discovery and replication sample of ∼157,622 individuals from 25 studies. We identified 16 common variants (8 of which were coding variants) associated with one or more WBC traits, the majority of which are pleiotropically associated with autoimmune diseases. Based on functional annotation, these loci included genes encoding surface markers of myeloid, lymphoid, or hematopoietic stem cell differentiation (CD69, CD33, CD87), transcription factors regulating lineage specification during hematopoiesis (ASXL1, IRF8, IKZF1, JMJD1C, ETS2-PSMG1), and molecules involved in neutrophil clearance/apoptosis (C10orf54, LTA), adhesion (TNXB), or centrosome and microtubule structure/function (KIF9, TUBD1). Together with recent reports of somatic ASXL1 mutations among individuals with idiopathic cytopenias or clonal hematopoiesis of undetermined significance, the identification of a common regulatory 3′ UTR variant of ASXL1 suggests that both germline and somatic ASXL1 mutations contribute to lower blood counts in otherwise asymptomatic individuals. These association results shed light on genetic mechanisms that regulate circulating WBC counts and suggest a prominent shared genetic architecture with inflammatory and autoimmune diseases

    Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previous