LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis
Motivation: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. Results: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub, and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies.
Prioritizing disease and trait causal variants at the TNFAIP3 locus using functional and genomic features
Genome-wide association studies have associated thousands of genetic variants with complex traits and diseases, but pinpointing the causal variant(s) among those in tight linkage disequilibrium with each associated variant remains a major challenge. Here, we use seven experimental assays to characterize all common variants at the multiple disease-associated TNFAIP3 locus in five disease-relevant immune cell lines, based on a set of features related to regulatory potential. Trait/disease-associated variants are enriched among SNPs prioritized based on either: (1) residing within CRISPRi-sensitive regulatory regions, or (2) localizing in a chromatin accessible region while displaying allele-specific reporter activity. Of the 15 trait/disease-associated haplotypes at TNFAIP3, 9 have at least one variant meeting one or both of these criteria, 5 of which are further supported by genetic fine-mapping. Our work provides a comprehensive strategy to characterize genetic variation at important disease-associated loci, and aids in the effort to identify trait causal genetic variants.
Identification of common genetic risk variants for autism spectrum disorder
Autism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample-size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 individuals with ASD and 27,969 controls that identified five genome-wide-significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), we identified seven additional loci shared with other traits at equally strict significance levels. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across ASD subtypes. These results highlight biological insights, particularly relating to neuronal function and corticogenesis, and establish that GWAS performed at scale will be much more productive in the near term in ASD.
Genome-wide by Environment Interaction Studies of Depressive Symptoms and Psychosocial Stress in UK Biobank and Generation Scotland
Stress is associated with poorer physical and mental health. To improve our understanding of this link, we performed genome-wide association studies (GWAS) of depressive symptoms and genome-wide by environment interaction studies (GWEIS) of depressive symptoms and stressful life events (SLE) in two UK population-based cohorts (Generation Scotland and UK Biobank). No SNP was individually significant in either GWAS, but gene-based tests identified six genes associated with depressive symptoms in UK Biobank (DCC, ACSS3, DRD2, STAG1, FOXP2 and KYNU; p < 2.77 × 10⁻⁶). Two SNPs with genome-wide significant GxE effects were identified by GWEIS in Generation Scotland: rs12789145 (53 kb downstream of PIWIL4; p = 4.95 × 10⁻⁹; total SLE) and rs17070072 (intronic to ZCCHC2; p = 1.46 × 10⁻⁸; dependent SLE). A third locus upstream of CYLC2 (rs12000047 and rs12005200, p < 2.00 × 10⁻⁸; dependent SLE) was identified when the joint effect of the SNP main and GxE effects was considered. GWEIS gene-based tests identified MTNR1B with a GxE effect with dependent SLE in Generation Scotland, and PHF2 with the joint effect in UK Biobank (p < 2.77 × 10⁻⁶). Polygenic risk score (PRS) analyses incorporating GxE effects improved the prediction of depressive symptom scores when using weights derived from either the UK Biobank GWAS of depressive symptoms (p = 0.01) or the PGC GWAS of major depressive disorder (p = 5.91 × 10⁻³). Using an independent sample, PRS derived using GWEIS GxE effects provided evidence of shared aetiologies between depressive symptoms and schizotypal personality, heart disease and COPD. Further such studies are required and may result in improved treatments for depression and other stress-related conditions.
Analysis of shared heritability in common disorders of the brain
Paroxysmal Cerebral Disorder
Integrated analysis of environmental and genetic influences on cord blood DNA methylation in new-borns
Epigenetic processes, including DNA methylation (DNAm), are among the mechanisms allowing integration of genetic and environmental factors to shape cellular function. While many studies have investigated either environmental or genetic contributions to DNAm, few have assessed their integrated effects. Here we examine the relative contributions of prenatal environmental factors and genotype on DNA methylation in neonatal blood at variably methylated regions (VMRs) in 4 independent cohorts (overall n = 2365). We use Akaike’s information criterion to test which factors best explain variability of methylation in the cohort-specific VMRs: several prenatal environmental factors (E), genotypes in cis (G), or their additive (G + E) or interaction (GxE) effects. Genetic and environmental factors in combination best explain DNAm at the majority of VMRs. The CpGs best explained by either G, G + E or GxE are functionally distinct. The enrichment of genetic variants from GxE models in GWAS for complex disorders supports their importance for disease risk.
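The model comparison described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: it assumes a single SNP genotype `g`, a single environmental factor `e`, and a Gaussian linear model for the methylation value `y`, and it uses the least-squares form of Akaike's information criterion, n ln(RSS/n) + 2k, which equals the usual AIC up to an additive constant. All function and variable names are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gaussian_aic(columns, y):
    """AIC (up to a constant) of an OLS fit of y on an intercept plus `columns`."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * y[i] for i, row in enumerate(design)) for j in range(p)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((y[i] - sum(b * v for b, v in zip(beta, design[i]))) ** 2
              for i in range(n))
    k = p + 1  # regression coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k


def best_model(g, e, y):
    """Score the E, G, G+E and GxE models and return the lowest-AIC name."""
    gxe = [gi * ei for gi, ei in zip(g, e)]
    candidates = {
        "E": [e],
        "G": [g],
        "G+E": [g, e],
        "GxE": [g, e, gxe],  # main effects plus the interaction term
    }
    scores = {name: gaussian_aic(cols, y) for name, cols in candidates.items()}
    return min(scores, key=scores.get), scores
```

Calling `best_model(g, e, y)` on three equal-length lists returns the name of the model with the lowest AIC together with all four scores. The 2k penalty means a richer model such as GxE is preferred only when it reduces the residual sum of squares enough to justify its extra parameter.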
Analysis of shared heritability in common disorders of the brain
Disorders of the brain can exhibit considerable epidemiological comorbidity and often share symptoms, provoking debate about their etiologic overlap. We quantified the genetic sharing of 25 brain disorders from genome-wide association studies of 265,218 patients and 784,643 control participants and assessed their relationship to 17 phenotypes from 1,191,588 individuals. Psychiatric disorders share common variant risk, whereas neurological disorders appear more distinct from one another and from the psychiatric disorders. We also identified significant sharing between disorders and a number of brain phenotypes, including cognitive measures. Further, we conducted simulations to explore how statistical power, diagnostic misclassification, and phenotypic heterogeneity affect genetic correlations. These results highlight the importance of common genetic variation as a risk factor for brain disorders and the value of heritability-based methods in understanding their etiology.
Interrogation of human hematopoiesis at single-cell and single-variant resolution
Widespread linkage disequilibrium and incomplete annotation of cell-to-cell state variation represent substantial challenges to elucidating mechanisms of trait-associated genetic variation. Here we perform genetic fine-mapping for blood cell traits in the UK Biobank to identify putative causal variants. These variants are enriched in genes encoding proteins in trait-relevant biological pathways and in accessible chromatin of hematopoietic progenitors. For regulatory variants, we explore patterns of developmental enhancer activity, predict molecular mechanisms, and identify likely target genes. In several instances, we localize multiple independent variants to the same regulatory element or gene. We further observe that variants with pleiotropic effects preferentially act in common progenitor populations to direct the production of distinct lineages. Finally, we leverage fine-mapped variants in conjunction with continuous epigenomic annotations to identify trait-cell type enrichments within closely related populations and in single cells. Our study provides a comprehensive framework for single-variant and single-cell analyses of genetic associations.
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies.
Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
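The relationship this abstract describes can be written as E[chi2_j] = N * h2 * l_j / M + N * a + 1, where l_j is the LD Score of SNP j, N the sample size, M the number of SNPs, h2 the SNP heritability and a the contribution of confounding. The sketch below illustrates that idea under simplifying assumptions: it fits an unweighted ordinary least-squares line, whereas the published method uses a weighted regression with block-jackknife standard errors. The function names are hypothetical and are not part of any LD Score regression software.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def expected_chi2(ld_score, n_samples, n_snps, h2, confounding):
    """Expected chi-square statistic of a SNP under the model above."""
    return n_samples * h2 * ld_score / n_snps + n_samples * confounding + 1.0


def ld_score_regression(chi2, ld_scores, n_samples, n_snps):
    """Regress chi-square statistics on LD Scores (unweighted sketch).

    Returns (intercept, h2). The intercept estimates 1 + N * a, so a value
    above 1 suggests confounding; the slope is rescaled to SNP heritability.
    """
    intercept, slope = ols_fit(ld_scores, chi2)
    h2 = slope * n_snps / n_samples
    return intercept, h2


# Toy, noise-free usage: generate statistics from the model and fit them.
ld_scores = [5.0, 20.0, 50.0, 120.0, 300.0]
chi2 = [expected_chi2(l, 50000, 1000000, 0.4, 1e-6) for l in ld_scores]
intercept, h2 = ld_score_regression(chi2, ld_scores, 50000, 1000000)
```

The sketch separates the two sources of inflation: polygenic signal grows with the LD Score and appears in the slope, while confounding inflates all SNPs equally and appears in the intercept.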