
    Performance of random forest when SNPs are in linkage disequilibrium

    BACKGROUND: Single nucleotide polymorphisms (SNPs) may be correlated due to linkage disequilibrium (LD). Association studies look for both direct and indirect associations with disease loci. In a Random Forest (RF) analysis, correlation between a true risk SNP and SNPs in LD may lead to diminished variable importance for the true risk SNP. One approach to address this problem is to select SNPs in linkage equilibrium (LE) for analysis. Here, we explore alternative methods for dealing with SNPs in LD: change the tree-building algorithm by building each tree in an RF only with SNPs in LE, modify the importance measure (IM), and use haplotypes instead of SNPs to build an RF. RESULTS: We evaluated the performance of our alternative methods by simulation of a spectrum of complex genetics models. When a haplotype rather than an individual SNP is the risk factor, we find that the original Random Forest method performed on SNPs provides good performance. When individual, genotyped SNPs are the risk factors, we find that the stronger the genetic effect, the stronger the effect LD has on the performance of the original RF. A revised importance measure used with the original RF is relatively robust to LD among SNPs; this revised importance measure used with the revised RF is sometimes inflated. Overall, we find that the revised importance measure used with the original RF is the best choice when the genetic model and the number of SNPs in LD with risk SNPs are unknown. For the haplotype-based method, under a multiplicative heterogeneity model, we observed a decrease in the performance of RF with increasing LD among the SNPs in the haplotype. CONCLUSION: Our results suggest that by strategically revising the Random Forest method tree-building or importance measure calculation, power can increase when LD exists between SNPs. We conclude that the revised Random Forest method performed on SNPs offers an advantage of not requiring genotype phase, making it a viable tool for use in the context of thousands of SNPs, such as candidate gene studies and follow-up of top candidates from genome wide association studies.
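The abstract above mentions selecting SNPs in linkage equilibrium before analysis. A minimal sketch of one way this could be done is greedy pruning on pairwise r², shown below. It is not the authors' procedure: the function names, the 0.2 threshold, and the toy genotypes are all illustrative assumptions, and the squared correlation of unphased 0/1/2 genotype counts is used only as a proxy for LD r².

```python
from math import sqrt  # not strictly needed; kept for readers extending to r rather than r^2


def pearson_r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Monomorphic SNP: correlation is undefined; treated as 0 here (assumption).
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def ld_prune(snps, r2_threshold=0.2):
    """Greedily keep SNPs whose r^2 with every already-kept SNP is below the threshold.

    snps: list of per-SNP genotype vectors, all over the same individuals.
    Returns the indices of the retained SNPs.
    """
    kept = []
    for j, snp in enumerate(snps):
        if all(pearson_r2(snp, snps[k]) < r2_threshold for k in kept):
            kept.append(j)
    return kept


# Hypothetical toy genotypes for nine individuals.
snps = [
    [0, 1, 2, 0, 1, 2, 0, 1, 2],  # SNP 0
    [0, 1, 2, 0, 1, 2, 0, 1, 2],  # SNP 1: identical to SNP 0 (perfect LD)
    [0, 0, 0, 1, 1, 1, 2, 2, 2],  # SNP 2: uncorrelated with SNP 0
]
print(ld_prune(snps))  # [0, 2]
```

The greedy order matters: whichever SNP of a correlated pair comes first is the one retained, which is one reason pruning can discard the true risk SNP and keep a proxy.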

    Genome-wide meta-analysis of muscle weakness identifies 15 susceptibility loci in older men and women

    © 2021, The Author(s). Low muscle strength is an important heritable indicator of poor health linked to morbidity and mortality in older people. In a genome-wide association study meta-analysis of 256,523 Europeans aged 60 years and over from 22 cohorts, we identify 15 loci associated with muscle weakness (European Working Group on Sarcopenia in Older People definition: n = 48,596 cases, 18.9% of total), including 12 loci not implicated in previous analyses of continuous measures of grip strength. Loci include genes reportedly involved in autoimmune disease (HLA-DQA1, p = 4 × 10^-17), arthritis (GDF5, p = 4 × 10^-13), cell cycle control and cancer protection, regulation of transcription, and others involved in the development and maintenance of the musculoskeletal system. Using Mendelian randomization we report possible overlapping causal pathways, including diabetes susceptibility, haematological parameters, and the immune system. We conclude that muscle weakness in older adults has distinct mechanisms from continuous strength, including several pathways considered to be hallmarks of ageing.

    Eight common genetic variants associated with serum DHEAS levels suggest a key role in ageing mechanisms

    Dehydroepiandrosterone sulphate (DHEAS) is the most abundant circulating steroid secreted by adrenal glands, yet its function is unknown. Its serum concentration declines significantly with increasing age, which has led to speculation that a relative DHEAS deficiency may contribute to the development of common age-related diseases or diminished longevity. We conducted a meta-analysis of genome-wide association data with 14,846 individuals and identified eight independent common SNPs associated with serum DHEAS concentrations. Genes at or near the identified loci include ZKSCAN5 (rs11761528; p = 3.15 × 10^-36), SULT2A1 (rs2637125; p = 2.61 × 10^-19), ARPC1A (rs740160; p = 1.56 × 10^-16), TRIM4 (rs17277546; p = 4.50 × 10^-11), BMF (rs7181230; p = 5.44 × 10^-11), HHEX (rs2497306; p = 4.64 × 10^-9), BCL2L11 (rs6738028; p = 1.72 × 10^-8), and CYP2C9 (rs2185570; p = 2.29 × 10^-8). These genes are associated with type 2 diabetes, lymphoma, actin filament assembly, drug and xenobiotic metabolism, and zinc finger proteins. Several SNPs were associated with changes in gene expression levels, and the related genes are connected to biological pathways linking DHEAS with ageing. This study provides much needed insight into the function of DHEAS.

    Two novel loci, COBL and SLC10A2, for Alzheimer's disease in African Americans

    INTRODUCTION: African Americans' (AAs) late-onset Alzheimer's disease (LOAD) genetic risk profile is incompletely understood. Including clinical covariates in genetic analyses using informed conditioning might improve study power. METHODS: We conducted a genome-wide association study (GWAS) in AAs employing informed conditioning in 1825 LOAD cases and 3784 cognitively normal controls. We derived a posterior liability conditioned on age, sex, diabetes status, current smoking status, educational attainment, and affection status, with parameters informed by external prevalence information. We assessed association between the posterior liability and a genome-wide set of single-nucleotide polymorphisms (SNPs), controlling for APOE and ABCA7, identified previously in a LOAD GWAS of AAs. RESULTS: Two SNPs at novel loci, rs112404845 (P = 3.8 × 10^-8), upstream of COBL, and rs16961023 (P = 4.6 × 10^-8), downstream of SLC10A2, obtained genome-wide significant evidence of association with the posterior liability. DISCUSSION: An informed conditioning approach can detect LOAD genetic associations in AAs not identified by traditional GWAS.

    Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche.

    Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation, but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10^-8) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1-WDR25, MKRN3-MAGEL2 and KCNK9) demonstrating parent-of-origin-specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and γ-aminobutyric acid-B2 receptor signalling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition.

    GWAS of epigenetic aging rates in blood reveals a critical role for TERT.

    DNA methylation age is an accurate biomarker of chronological age and predicts lifespan, but its underlying molecular mechanisms are unknown. In this genome-wide association study of 9907 individuals, we find gene variants mapping to five loci associated with intrinsic epigenetic age acceleration (IEAA) and gene variants in three loci associated with extrinsic epigenetic age acceleration (EEAA). Mendelian randomization analysis suggests causal influences of menarche and menopause on IEAA and lipoproteins on IEAA and EEAA. Variants associated with longer leukocyte telomere length (LTL) in the telomerase reverse transcriptase gene (TERT) paradoxically confer higher IEAA (P < 2.7 × 10^-11). Causal modeling indicates TERT-specific and independent effects on LTL and IEAA. Experimental hTERT-expression in primary human fibroblasts engenders a linear increase in DNA methylation age with cell population doubling number. Together, these findings indicate a critical role for hTERT in regulating the epigenetic clock, in addition to its established role of compensating for cell replication-dependent telomere shortening.

    Genetic Determinants of Circulating Estrogen Levels and Evidence of a Causal Effect of Estradiol on Bone Density in Men.

    CONTEXT: Serum estradiol (E2) and estrone (E1) levels exhibit substantial heritability. OBJECTIVE: To investigate the genetic regulation of serum E2 and E1 in men. DESIGN, SETTING, AND PARTICIPANTS: Genome-wide association study in 11,097 men of European origin from nine epidemiological cohorts. MAIN OUTCOME MEASURES: Genetic determinants of serum E2 and E1 levels. RESULTS: Variants in/near CYP19A1 demonstrated the strongest evidence for association with E2, resolving to three independent signals. Two additional independent signals were found on the X chromosome; family with sequence similarity 9, member B (FAM9B), rs5934505 (P = 3.4 × 10^-8) and Xq27.3, rs5951794 (P = 3.1 × 10^-10). E1 signals were found in CYP19A1 (rs2899472, P = 5.5 × 10^-23), in tripartite motif containing 4 (TRIM4; rs17277546, P = 5.8 × 10^-14), and CYP11B1/B2 (rs10093796, P = 1.2 × 10^-8). E2 signals in CYP19A1 and FAM9B were associated with bone mineral density (BMD). Mendelian randomization analysis suggested a causal effect of serum E2 on BMD in men. A 1 pg/mL genetically increased E2 was associated with a 0.048 standard deviation increase in lumbar spine BMD (P = 2.8 × 10^-12). In men and women combined, CYP19A1 alleles associated with higher E2 levels were associated with lower degrees of insulin resistance. CONCLUSIONS: Our findings confirm that CYP19A1 is an important genetic regulator of E2 and E1 levels and strengthen the causal importance of E2 for bone health in men. We also report two independent loci on the X-chromosome for E2, and one locus each in TRIM4 and CYP11B1/B2, for E1.
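The abstract above reports a Mendelian randomization estimate expressed per unit of genetically increased exposure. As a hedged illustration of how such a per-unit estimate can arise, the sketch below computes a Wald ratio per variant and pools variants with fixed-effect inverse-variance weighting. The abstract does not state which estimator the authors used, so this is an assumed, generic approach; the function names and all input numbers are hypothetical.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio for one variant: outcome effect per unit of exposure.

    Uses the first-order standard error, which ignores uncertainty
    in the variant-exposure estimate (a common simplification).
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw(estimates):
    """Fixed-effect inverse-variance-weighted pooling of (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se


# Hypothetical summary statistics: (variant effect on exposure in pg/mL,
# variant effect on outcome in SD units, standard error of the outcome effect).
variants = [(2.0, 0.10, 0.02), (1.0, 0.04, 0.02)]
per_variant = [wald_ratio(bx, by, se) for bx, by, se in variants]
pooled, pooled_se = ivw(per_variant)
print(per_variant, pooled, pooled_se)
```

Variants with a stronger effect on the exposure receive a smaller standard error and therefore more weight in the pooled estimate.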

    Association of Long Runs of Homozygosity With Alzheimer Disease Among African American Individuals

    IMPORTANCE: Mutations in known causal Alzheimer disease (AD) genes account for only 1% to 3% of patients and almost all are dominantly inherited. Recessive inheritance of complex phenotypes can be linked to long (>1-megabase [Mb]) runs of homozygosity (ROHs) detectable by single-nucleotide polymorphism (SNP) arrays. OBJECTIVE: To evaluate the association between ROHs and AD in an African American population known to have a risk for AD up to 3 times higher than white individuals. DESIGN, SETTING, AND PARTICIPANTS: Case-control study of a large African American data set previously genotyped on different genome-wide SNP arrays conducted from December 2013 to January 2015. Global and locus-based ROH measurements were analyzed using raw or imputed genotype data. We studied the raw genotypes from 2 case-control subsets grouped based on SNP array: Alzheimer's Disease Genetics Consortium data set (871 cases and 1620 control individuals) and Chicago Health and Aging Project-Indianapolis Ibadan Dementia Study data set (279 cases and 1367 control individuals). We then examined the entire data set using imputed genotypes from 1917 cases and 3858 control individuals. MAIN OUTCOMES AND MEASURES: The ROHs larger than 1 Mb, 2 Mb, or 3 Mb were investigated separately for global burden evaluation, consensus regions, and gene-based analyses. RESULTS: The African American cohort had a low degree of inbreeding (F ~ 0.006). In the Alzheimer's Disease Genetics Consortium data set, we detected a significantly higher proportion of cases with ROHs greater than 2 Mb (P = .004) or greater than 3 Mb (P = .02), as well as a significant 114-kilobase consensus region on chr4q31.3 (empirical P value 2 = .04; ROHs >2 Mb). 
In the Chicago Health and Aging Project-Indianapolis Ibadan Dementia Study data set, we identified a significant 202-kilobase consensus region on Chr15q24.1 (empirical P value 2 = .02; ROHs >1 Mb) and a cluster of 13 significant genes on Chr3p21.31 (empirical P value 2 = .03; ROHs >3 Mb). A total of 43 of 49 nominally significant genes common for both data sets also mapped to Chr3p21.31. Analyses of imputed SNP data from the entire data set confirmed the association of AD with global ROH measurements (12.38 ROHs >1 Mb in cases vs 12.11 in controls; 2.986 Mb average size of ROHs >2 Mb in cases vs 2.889 Mb in controls; and 22% of cases with ROHs >3 Mb vs 19% of controls) and a gene-cluster on Chr3p21.31 (empirical P value 2 = .006-.04; ROHs >3 Mb). Also, we detected a significant association between AD and CLDN17 (empirical P value 2 = .01; ROHs >1 Mb), encoding a protein from the Claudin family, members of which were previously suggested as AD biomarkers. CONCLUSIONS AND RELEVANCE: To our knowledge, we discovered the first evidence of increased burden of ROHs among patients with AD from an outbred African American population, which could reflect either the cumulative effect of multiple ROHs to AD or the contribution of specific loci harboring recessive mutations and risk haplotypes in a subset of patients. Sequencing is required to uncover AD variants in these individuals.

    A genome-wide association study of aging

    Human longevity and healthy aging show moderate heritability (20%–50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 × 10^-8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10^-5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.