Association of Forced Vital Capacity with the Developmental Gene NCOR2.
BACKGROUND: Forced Vital Capacity (FVC) is an important predictor of all-cause mortality in the absence of chronic respiratory conditions. Epidemiological evidence highlights the role of early life factors on adult FVC, pointing to environmental exposures and genes affecting lung development as risk factors for low FVC later in life. Although FVC is highly heritable, only a small number of genes have been found to be associated with it, so we aimed to identify further genetic variants by focusing on lung development genes. METHODS: Per-allele effects of 24,728 SNPs in 403 genes involved in lung development were tested in 7,749 adults from three studies (NFBC1966, ECRHS, EGEA). The most significant SNP for each of the top 25 genes was followed up in 46,103 adults (CHARGE and SpiroMeta consortia) and 5,062 children (ALSPAC). Associations were considered replicated if the replication p-value survived Bonferroni correction (p<0.002; 0.05/25), with a nominally significant p-value considered suggestive evidence. For SNPs with evidence of replication, effects on the expression levels of nearby genes in lung tissue were tested in 1,111 lung samples (Lung eQTL consortium), with further functional investigation performed using public epigenomic profiling data (ENCODE). RESULTS: NCOR2-rs12708369 showed strong replication in children (p = 0.0002), with replication unavailable in adults due to low imputation quality. This intronic variant is in a strong transcriptional enhancer element in lung fibroblasts, but its eQTL effects could not be tested due to low imputation quality in the eQTL dataset. SERPINE2-rs6754561 replicated at nominal level in both adults (p = 0.036) and children (p = 0.045), while WNT16-rs2707469 replicated at nominal level only in adults (p = 0.026). The eQTL analyses showed association of WNT16-rs2707469 with expression levels of the nearby gene CPED1. We found no statistically significant eQTL effects for SERPINE2-rs6754561.
CONCLUSIONS: We have identified a new gene, NCOR2, in the retinoic acid signalling pathway pointing to a role of vitamin A metabolism in the regulation of FVC. Our findings also support SERPINE2, a COPD gene with weak previous evidence of association with FVC, and suggest WNT16 as a further promising candidate.
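The replication rule in the abstract above (Bonferroni threshold 0.05/25 = 0.002, with nominal significance treated as suggestive) can be illustrated with a minimal Python sketch. The function name `classify` and the use of 0.05 as the nominal cut-off are assumptions made for illustration; the p-values are the ones reported in the abstract, and this is not the authors' analysis code.

```python
# Illustrative sketch only: classify replication p-values using the
# Bonferroni rule described in the abstract (0.05 / 25 follow-up genes).
ALPHA = 0.05      # assumed nominal significance level
N_TESTS = 25      # number of top genes followed up


def classify(p_value, alpha=ALPHA, n_tests=N_TESTS):
    """Return 'replicated', 'suggestive' or 'not replicated' for one p-value."""
    bonferroni_threshold = alpha / n_tests  # 0.002 for 25 tests
    if p_value < bonferroni_threshold:
        return "replicated"
    if p_value < alpha:
        return "suggestive"
    return "not replicated"


# Replication p-values as reported in the abstract.
reported = {
    "NCOR2-rs12708369 (children)": 0.0002,
    "SERPINE2-rs6754561 (adults)": 0.036,
    "SERPINE2-rs6754561 (children)": 0.045,
    "WNT16-rs2707469 (adults)": 0.026,
}

for label, p in reported.items():
    print(f"{label}: p = {p} -> {classify(p)}")
```

Under these assumptions only the NCOR2 result in children passes the corrected threshold, while the SERPINE2 and WNT16 results fall in the suggestive band, which matches the wording of the abstract.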
HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics.
MOTIVATION: Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this paper, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. RESULTS: Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulations with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits (GIANT) height data, HAPRAP performs well with a small training sample size (N<2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by SNPs with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization). AVAILABILITY: The HAPRAP package and documentation are available online: http://apps.biocompute.org.uk/haprap
LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis.
MOTIVATION: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. RESULTS: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. 
AVAILABILITY AND IMPLEMENTATION: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/ SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
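As a rough illustration of the idea behind LD score regression, the sketch below simulates association statistics under the standard model, in which the expected chi-square statistic of SNP j is 1 + N·h²·ℓ_j/M in the absence of confounding, and recovers the SNP heritability from the regression slope. It assumes NumPy is available. All names and parameter values here (`simulate_chi2`, `estimate_h2`, the uniform LD scores, N, M, h²) are hypothetical, and the fit is an unweighted least-squares line rather than the weighted regression with block-jackknife standard errors used by the real software, so this is a toy sketch and not the LD Hub pipeline.

```python
import numpy as np


def simulate_chi2(ld_scores, n_samples, h2, rng):
    """Simulate chi-square statistics whose mean is 1 + N * h2 * l_j / M."""
    m = ld_scores.size
    var_z = 1.0 + n_samples * h2 * ld_scores / m  # variance of each z-score
    z = rng.normal(0.0, np.sqrt(var_z))           # one z-score per SNP
    return z ** 2


def estimate_h2(chi2, ld_scores, n_samples):
    """Unweighted regression of chi-square on LD score (toy version).

    The slope estimates N * h2 / M, so h2 = slope * M / N.
    The intercept is expected to be near 1 when there is no confounding.
    """
    slope, intercept = np.polyfit(ld_scores, chi2, 1)
    m = ld_scores.size
    return slope * m / n_samples, intercept


rng = np.random.default_rng(0)
ld_scores = rng.uniform(1.0, 200.0, size=50_000)   # hypothetical LD scores
chi2 = simulate_chi2(ld_scores, n_samples=50_000, h2=0.4, rng=rng)
h2_hat, intercept = estimate_h2(chi2, ld_scores, n_samples=50_000)
print(f"estimated h2 = {h2_hat:.3f}, intercept = {intercept:.2f}")
```

Because only per-SNP summary statistics and LD scores enter the regression, the cost does not grow with the GWAS sample size, which is the tractability property the abstract refers to.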
Integrative pathway genomics of lung function and airflow obstruction
Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signalling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analysed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10’s role in influencing the lung’s susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unravelled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease.
Cardiometabolic effects of genetic upregulation of the interleukin 1 receptor antagonist: a Mendelian randomisation analysis
Background:
To investigate potential cardiovascular and other effects of long-term pharmacological interleukin 1 (IL-1) inhibition, we studied genetic variants that produce inhibition of IL-1, a master regulator of inflammation.
Methods:
We created a genetic score combining the effects of alleles of two common variants (rs6743376 and rs1542176) that are located upstream of IL1RN, the gene encoding the IL-1 receptor antagonist (IL-1Ra; an endogenous inhibitor of both IL-1α and IL-1β); both alleles increase soluble IL-1Ra protein concentration. We compared effects on inflammation biomarkers of this genetic score with those of anakinra, the recombinant form of IL-1Ra, which has previously been studied in randomised trials of rheumatoid arthritis and other inflammatory disorders. In primary analyses, we investigated the score in relation to rheumatoid arthritis and four cardiometabolic diseases (type 2 diabetes, coronary heart disease, ischaemic stroke, and abdominal aortic aneurysm; 453 411 total participants). In exploratory analyses, we studied the relation of the score to many disease traits and to 24 other disorders of proposed relevance to IL-1 signalling (746 171 total participants).
Findings:
For each IL1RN minor allele inherited, serum concentrations of IL-1Ra increased by 0·22 SD (95% CI 0·18–0·25; 12·5%; p=9·3 × 10−33), concentrations of interleukin 6 decreased by 0·02 SD (−0·04 to −0·01; −1·7%; p=3·5 × 10−3), and concentrations of C-reactive protein decreased by 0·03 SD (−0·04 to −0·02; −3·4%; p=7·7 × 10−14). We noted the effects of the genetic score on these inflammation biomarkers to be directionally concordant with those of anakinra. The allele count of the genetic score had roughly log-linear, dose-dependent associations with both IL-1Ra concentration and risk of coronary heart disease. For people who carried four IL-1Ra-raising alleles, the odds ratio for coronary heart disease was 1·15 (1·08–1·22; p=1·8 × 10−6) compared with people who carried no IL-1Ra-raising alleles; the per-allele odds ratio for coronary heart disease was 1·03 (1·02–1·04; p=3·9 × 10−10). Per-allele odds ratios were 0·97 (0·95–0·99; p=9·9 × 10−4) for rheumatoid arthritis, 0·99 (0·97–1·01; p=0·47) for type 2 diabetes, 1·00 (0·98–1·02; p=0·92) for ischaemic stroke, and 1·08 (1·04–1·12; p=1·8 × 10−5) for abdominal aortic aneurysm. In exploratory analyses, we observed per-allele increases in concentrations of proatherogenic lipids, including LDL-cholesterol, but no clear evidence of association for blood pressure, glycaemic traits, or any of the 24 other disorders studied. Modelling suggested that the observed increase in LDL-cholesterol could account for about a third of the association observed between the genetic score and increased coronary risk.
Interpretation:
Human genetic data suggest that long-term dual IL-1α/β inhibition could increase cardiovascular risk and, conversely, reduce the risk of development of rheumatoid arthritis. The cardiovascular risk might, in part, be mediated through an increase in proatherogenic lipid concentrations.
Funding:
UK Medical Research Council, British Heart Foundation, UK National Institute for Health Research, National Institute for Health Research Cambridge Biomedical Research Centre, European Research Council, and European Commission Framework Programme 7.
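The "roughly log-linear, dose-dependent" relation reported in the Findings above can be checked with simple arithmetic: under a log-linear (multiplicative) model, the odds ratio for k alleles is the per-allele odds ratio raised to the power k. The Python sketch below uses only the rounded estimates quoted in the abstract. The function name `implied_odds_ratio` is hypothetical, and the calculation is an illustration, not a reproduction of the authors' models.

```python
# Illustrative arithmetic only, based on rounded figures from the abstract.
PER_ALLELE_OR = 1.03            # per-allele odds ratio for coronary heart disease
REPORTED_FOUR_ALLELE_OR = 1.15  # reported odds ratio, four alleles versus none
REPORTED_CI = (1.08, 1.22)      # reported 95% CI for the four-allele odds ratio


def implied_odds_ratio(n_alleles, per_allele_or=PER_ALLELE_OR):
    """Odds ratio implied by a log-linear (multiplicative) per-allele model."""
    return per_allele_or ** n_alleles


for k in range(5):
    print(f"{k} alleles: implied OR = {implied_odds_ratio(k):.3f}")

implied = implied_odds_ratio(4)
within_ci = REPORTED_CI[0] <= implied <= REPORTED_CI[1]
print(f"implied four-allele OR within reported CI: {within_ci}")
```

Four alleles at 1.03 per allele imply an odds ratio of about 1.13, which lies inside the reported interval around 1·15. That is consistent with an approximately log-linear dose-response, bearing in mind that the per-allele estimate is itself rounded.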
Genome-wide association analyses for lung function and chronic obstructive pulmonary disease identify new loci and potential druggable targets
Chronic Obstructive Pulmonary Disease (COPD) is characterised by reduced lung function and is the third leading cause of death globally. Through genome-wide association discovery in 48,943 individuals, selected from extremes of the lung function distribution in UK Biobank, and follow-up in 95,375 individuals, we increased the yield of independent signals for lung function from 54 to 97. A genetic risk score was associated with COPD susceptibility (odds ratio per standard deviation of the risk score (~6 alleles) 1.24, 95% confidence interval 1.20-1.27, P=5.05x10^-49) and we observed a 3.7-fold difference in COPD risk between highest and lowest genetic risk score deciles in UK Biobank. The 97 signals show enrichment in development, elastic fibres and epigenetic regulation pathways. We highlight targets for drugs and compounds in development for COPD and asthma (genes in the inositol phosphate metabolism pathway and CHRM3) and describe targets for potential drug repositioning from other clinical indications.
A novel common variant in DCST2 is associated with length in early life and height in adulthood
Common genetic variants have been identified for adult height, but not much is known about the genetics of skeletal growth in early life. To identify common genetic variants that influence fetal skeletal growth, we meta-analyzed 22 genome-wide association studies (Stage 1; N = 28 459). We identified seven independent top single nucleotide polymorphisms (SNPs) (P < 1 × 10^-6) for birth length, of which three were novel and four were in or near loci known to be associated with adult height (LCORL, PTCH1, GPR126 and HMGA2). The three novel SNPs were followed-up in nine replication studies (Stage 2; N = 11 995), with rs905938 in DC-STAMP domain containing 2 (DCST2) genome-wide significantly associated with birth length in a joint analysis (Stages 1 + 2; β = 0.046, SE = 0.008, P = 2.46 × 10^-8, explained variance = 0.05%). Rs905938 was also associated with infant length (N = 28 228; P = 5.54 × 10^-4) and adult height (N = 127 513; P = 1.45 × 10^-5). DCST2 is a DC-STAMP-like protein family member and DC-STAMP is an osteoclast cell-fusion regulator. Polygenic scores based on 180 SNPs previously associated with human adult stature explained 0.13% of variance in birth length. The same SNPs explained 2.95% of the variance of infant length. Of the 180 known adult height loci, 11 were genome-wide significantly associated with infant length (SF3B4, LCORL, SPAG17, C6orf173, PTCH1, GDF5, ZNFX1, HHIP, ACAN, HLA locus and HMGA2). This study highlights that common variation in DCST2 influences variation in early growth and adult height.
