
    SNP-based pathway enrichment analysis for genome-wide association studies

    Background: Recently we have witnessed a surge of interest in using genome-wide association studies (GWAS) to discover the genetic basis of complex diseases. Many genetic variations, mostly in the form of single nucleotide polymorphisms (SNPs), have been identified in a wide spectrum of diseases, including diabetes, cancer, and psychiatric diseases. A common theme arising from these studies is that the genetic variations discovered by GWAS explain only a small fraction of the genetic risk associated with complex diseases. New strategies and statistical approaches are needed to address this gap. One such approach is pathway analysis, which considers the genetic variations underlying a biological pathway jointly, rather than separately as in traditional GWAS. A critical challenge in pathway analysis is how to combine evidence of association over multiple SNPs within a gene and multiple genes within a pathway. Most current methods choose the most significant SNP from each gene as a representative, ignoring the joint action of multiple SNPs within a gene. This approach leads to preferential identification of genes with a greater number of SNPs. Results: We describe a SNP-based pathway enrichment method for GWAS. The method consists of two main steps: 1) for a given pathway, using an adaptive truncated product statistic to identify all representative (potentially more than one) SNPs of each gene, calculating the average number of representative SNPs for the genes, then re-selecting the representative SNPs of genes in the pathway based on this number; and 2) ranking all selected SNPs by the significance of their statistical association with a trait of interest, and testing whether the set of SNPs from a particular pathway is significantly enriched with high ranks using a weighted Kolmogorov-Smirnov test. We applied our method to two large, genetically distinct GWAS data sets of schizophrenia, one from a European-American (EA) sample and the other from an African-American (AA) sample. In the EA data set, we found 22 pathways with nominal P-value less than or equal to 0.001 and corresponding false discovery rate (FDR) less than 5%. In the AA data set, we found 11 pathways using the same nominal P-value and FDR thresholds. Interestingly, 8 of these pathways overlap with those found in the EA sample. We have implemented our method in a Java software package, called SNP Set Enrichment Analysis (SSEA), which contains a user-friendly interface and is freely available at http://cbcl.ics.uci.edu/SSEA. Conclusions: The SNP-based pathway enrichment method described here offers a new alternative approach for analysing GWAS data. By applying it to schizophrenia GWAS, we show that our method is able to identify statistically significant pathways and, importantly, pathways that can be replicated in large, genetically distinct samples.
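The enrichment step in the abstract above (ranking SNPs by association strength, then testing whether a pathway's SNPs cluster near the top with a weighted Kolmogorov-Smirnov test) can be illustrated with a running-sum statistic. The sketch below is not the SSEA implementation: the abstract does not give the weighting scheme, so the GSEA-style form used here, the function name `weighted_ks_enrichment`, and the `weight` parameter are all assumptions made for illustration.

```python
def weighted_ks_enrichment(scores, in_set, weight=1.0):
    """Return a GSEA-style weighted Kolmogorov-Smirnov enrichment score.

    scores : association strength per SNP (for example -log10 of the P-value);
             larger means more strongly associated.
    in_set : booleans, True where the SNP belongs to the pathway under test.
    weight : exponent applied to |score| for pathway SNPs (assumed form;
             weight=0 gives the classical unweighted KS statistic).
    """
    if len(scores) != len(in_set):
        raise ValueError("scores and in_set must have the same length")

    # Rank SNPs from most to least strongly associated.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hit_total = sum(abs(scores[i]) ** weight for i in order if in_set[i])
    n_miss = sum(1 for i in order if not in_set[i])
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running = 0.0
    best = 0.0
    for i in order:
        if in_set[i]:
            # Step up by the SNP's share of the total pathway weight.
            running += abs(scores[i]) ** weight / hit_total
        else:
            # Step down by a constant amount for each non-pathway SNP.
            running -= 1.0 / n_miss
        # Keep the largest deviation from zero, with its sign.
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    toy_scores = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5]
    toy_pathway = [True, True, False, False, False, False]
    print(weighted_ks_enrichment(toy_scores, toy_pathway))
```

A score near +1 means the pathway's SNPs sit at the top of the ranking, and a score near -1 means they sit at the bottom. In practice the significance of such a score is usually assessed by comparing it with scores from permuted data, which this sketch leaves out.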

    Genome-Wide Interaction and Pathway Association Studies for Body Mass Index

    Objective: We investigated gene interactions (epistasis) for body mass index (BMI) in a European-American adult female cohort via genome-wide interaction analyses (GWIA) and pathway association analyses. Methods: Genome-wide pairwise interaction analyses were carried out for BMI in 493 extremely obese cases (BMI > 35 kg/m^2) and 537 never-overweight controls (BMI < 25 kg/m^2). To further validate the results, specific SNPs were selected based on the GWIA results for haplotype-based association studies. Pathway-based association analyses were performed using a modified Gene Set Enrichment Algorithm (GSEA) (GenGen program) to further explore BMI-related pathways using our genome-wide association study (GWAS) data set, GIANT, ENGAGE, and DIAGRAM Consortia. Results: The EXOC4-1q23.1 interaction was associated with BMI, with the most significant epistasis between rs7800006 and rs10797020 (P = 2.63 × 10^-11). In the pathway-based association analysis, the Tob1 pathway showed the most significant association with BMI (empirical P < 0.001, FDR = 0.044, FWER = 0.040). These findings were further validated in different populations. Conclusion: Genome-wide pairwise SNP-SNP interaction and pathway analyses suggest that EXOC4 and TOB1-related pathways may contribute to the development of obesity

    Efficient pathway enrichment and network analysis of GWAS summary data using GSA-SNP2

    Pathway-based analysis in genome-wide association studies (GWAS) is widely used to uncover novel multi-genic functional associations. Many of these pathway-based methods have been used to test the enrichment of the associated genes in the pathways, but exhibited low power and were highly affected by free parameters. We present the novel method and software GSA-SNP2 for pathway enrichment analysis of GWAS P-value data. GSA-SNP2 provides high power, decent type I error control and fast computation by incorporating the random set model and a SNP-count adjusted gene score. In a comparative study using simulated and real GWAS data, GSA-SNP2 exhibited high power and best prioritized gold standard positive pathways compared with six existing enrichment-based methods and two self-contained methods (an alternative pathway analysis approach). Based on these results, the difference between pathway analysis approaches was investigated and the effects of the gene correlation structures on the pathway enrichment analysis were also discussed. In addition, GSA-SNP2 is able to visualize protein interaction networks within and across the significant pathways so that the user can prioritize the core subnetworks for further studies. GSA-SNP2 is freely available at https://sourceforge.net/projects/gsasnp2

    Genetic associations with childhood brain growth, defined in two longitudinal cohorts

    Genome-wide association studies (GWASs) are unraveling the genetics of adult brain neuroanatomy as measured by cross-sectional anatomic magnetic resonance imaging (aMRI). However, the genetic mechanisms that shape childhood brain development are, as yet, largely unexplored. In this study we identify common genetic variants associated with childhood brain development as defined by longitudinal aMRI. Genome-wide single nucleotide polymorphism (SNP) data were determined in two cohorts: one enriched for attention-deficit/hyperactivity disorder (ADHD) (LONG cohort: 458 participants; 119 with ADHD) and the other from a population-based cohort (Generation R: 257 participants). The growth of the brain's major regions (cerebral cortex, white matter, basal ganglia, and cerebellum) and one region of interest (the right lateral prefrontal cortex) were defined on all individuals from two aMRIs, and a GWAS and a pathway analysis were performed. In addition, association between polygenic risk for ADHD and brain growth was determined for the LONG cohort. For white matter growth, GWAS meta-analysis identified a genome-wide significant intergenic SNP (rs12386571, P = 9.09 × 10^-9), near AKR1B10. This gene is part of the aldo-keto reductase superfamily and shows neural expression. No enrichment of neural pathways was detected and polygenic risk for ADHD was not associated with the brain growth phenotypes in the LONG cohort that was enriched for the diagnosis of ADHD. The study illustrates the use of a novel brain growth phenotype defined in vivo for further study

    Association Signals Unveiled by a Comprehensive Gene Set Enrichment Analysis of Dental Caries Genome-Wide Association Studies

    Gene set-based analysis of genome-wide association study (GWAS) data has recently emerged as a useful approach to examine the joint effects of multiple risk loci in complex human diseases or phenotypes. Dental caries is a common, chronic, and complex disease leading to a decrease in quality of life worldwide. In this study, we applied gene set enrichment analysis to a major dental caries GWAS dataset, which consists of 537 cases and 605 controls. Using four complementary gene set analysis methods, we analyzed 1331 Gene Ontology (GO) terms collected from the Molecular Signatures Database (MSigDB). Setting the false discovery rate (FDR) threshold at 0.05, we identified 13 significantly associated GO terms. Additionally, 17 terms were included as marginally associated because they were top ranked by each method, although their FDR was higher than 0.05. In total, we identified 30 promising GO terms, including 'Sphingoid metabolic process,' 'Ubiquitin protein ligase activity,' 'Regulation of cytokine secretion,' and 'Ceramide metabolic process.' These GO terms encompass broad functions that potentially interact and contribute to the oral immune response related to caries development, which have not been reported in the standard single marker based analysis. Collectively, our gene set enrichment analysis provided complementary insights into the molecular mechanisms and polygenic interactions in dental caries, revealing promising association signals that could not be detected through single marker analysis of GWAS data. © 2013 Wang et al
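Several abstracts in this list select gene sets or pathways by a false discovery rate (FDR) threshold such as 0.05. None of them states which FDR procedure was used, so the sketch below uses the Benjamini-Hochberg step-up adjustment purely as an assumed, commonly used example. The function name `benjamini_hochberg` and the toy P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (q-values), in input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    toy_p = [0.01, 0.04, 0.03, 0.005]  # made-up P-values for four gene sets
    q = benjamini_hochberg(toy_p)
    significant = [i for i, value in enumerate(q) if value <= 0.05]
    print(q, significant)
```

A gene set is then reported as significant when its adjusted value falls at or below the chosen threshold, which is how a cut-off like "FDR 0.05" would be applied under this assumption.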

    Fifteen new risk loci for coronary artery disease highlight arterial-wall-specific mechanisms

    Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide. Although 58 genomic regions have been associated with CAD thus far, most of the heritability is unexplained, indicating that additional susceptibility loci await identification. An efficient discovery strategy may be larger-scale evaluation of promising associations suggested by genome-wide association studies (GWAS). Hence, we genotyped 56,309 participants using a targeted gene array derived from earlier GWAS results and performed meta-analysis of results with 194,427 participants previously genotyped, totaling 88,192 CAD cases and 162,544 controls. We identified 25 new SNP-CAD associations (P < 5 × 10^-8, in fixed-effects meta-analysis) from 15 genomic regions, including SNPs in or near genes involved in cellular adhesion, leukocyte migration and atherosclerosis (PECAM1, rs1867624), coagulation and inflammation (PROCR, rs867186 (p.Ser219Gly)) and vascular smooth muscle cell differentiation (LMOD1, rs2820315). Correlation of these regions with cell-type-specific gene expression and plasma protein levels sheds light on potential disease mechanisms

    Whole-genome association analysis of treatment response in obsessive-compulsive disorder.

    Up to 30% of patients with obsessive-compulsive disorder (OCD) exhibit an inadequate response to serotonin reuptake inhibitors (SRIs). To date, genetic predictors of OCD treatment response have not been systematically investigated using genome-wide association studies (GWAS). To identify specific genetic variations potentially influencing SRI response, we conducted a GWAS in 804 OCD patients with information on SRI response. SRI response was classified as 'response' (n=514) or 'non-response' (n=290), based on self-report. We used the more powerful Quasi-Likelihood Score Test (the MQLS test) to conduct a genome-wide association test correcting for relatedness, and then used an adjusted logistic model to evaluate the effect size of the variants in probands. The top single-nucleotide polymorphism (SNP) was rs17162912 (P = 1.76 × 10^-8), which is near the DISP1 gene on 1q41-q42, a microdeletion region implicated in neurological development. The other six SNPs showing suggestive evidence of association (P < 10^-5) were rs9303380, rs12437601, rs16988159, rs7676822, rs1911877 and rs723815. Among them, two SNPs in strong linkage disequilibrium, rs7676822 and rs1911877, located near the PCDH10 gene, gave P-values of 2.86 × 10^-6 and 8.41 × 10^-6, respectively. The other 35 variations with signals of potential significance (P < 10^-4) involve multiple genes expressed in the brain, including GRIN2B, PCDH10 and GPC6. Our enrichment analysis indicated suggestive roles of genes in the glutamatergic neurotransmission system (false discovery rate (FDR) = 0.0097) and the serotonergic system (FDR = 0.0213). Although the results presented may provide new insights into genetic mechanisms underlying treatment response in OCD, studies with larger sample sizes and detailed information on drug dosage and treatment duration are needed

    Gene and Pathway-Based Analysis: Second Wave of Genome-wide Association Studies

    Despite the great success of GWAS in identifying common genetic variants associated with complex diseases, current GWAS have focused on single SNP analysis. However, single SNP analysis often identifies a number of the most significant SNPs that account for only a small proportion of the genetic variants and offers limited understanding of complex diseases. To overcome these limitations, we propose gene and pathway-based association analysis as a new paradigm for GWAS. As a proof of concept, we performed a comprehensive gene and pathway-based association analysis for thirteen published GWAS. Our results showed that the proposed new paradigm for GWAS not only identified the genes that include significant SNPs found by single SNP analysis, but also detected new genes in which each single SNP conferred small disease risk, but their joint actions were implicated in the development of diseases. The results also demonstrated that the new paradigm for GWAS was able to identify biologically meaningful pathways associated with the diseases, which were confirmed by gene-set rich analysis using gene expression data

    The South Asian genome

    Keywords: Genetics of disease; Microarrays; Variant genotypes; Population genetics; Sequence alignment; Alleles. The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease which are highly prevalent amongst South Asians. Funding: Whole genome sequencing to discover genetic variants underlying type-2 diabetes, coronary heart disease and related phenotypes amongst Indian Asians. Imperial College Healthcare NHS Trust cBRC 2011-13 (JS Kooner [PI], JC Chambers)