
    In search of causal variants: refining disease association signals using cross-population contrasts

    Background: Genome-wide association (GWA) using large numbers of single nucleotide polymorphisms (SNPs) is now a powerful, state-of-the-art approach to mapping human disease genes. When a GWA study detects association between a SNP and the disease, this signal usually represents association with a set of several highly correlated SNPs in strong linkage disequilibrium. The challenge we address is to distinguish among these correlated loci to highlight potential functional variants and prioritize them for follow-up. Results: We implemented a systematic method for testing association across diverse population samples having differing histories and LD patterns, using a logistic regression framework. The hypothesis is that important underlying biological mechanisms are shared across human populations, and we can filter correlated variants by testing for heterogeneity of genetic effects in different population samples. This approach formalizes the descriptive comparison of p-values that has typified similar cross-population fine-mapping studies to date. We applied this method to correlated SNPs in the cholinergic nicotinic receptor gene cluster CHRNA5-CHRNA3-CHRNB4, in a case-control study of cocaine dependence composed of 504 European-American and 583 African-American samples. Of the 10 SNPs genotyped in the r² ≥ 0.8 bin for rs16969968, three demonstrated significant cross-population heterogeneity and are filtered from priority follow-up; the remaining SNPs include rs16969968 (heterogeneity p = 0.75). Though the power to filter out rs16969968 is reduced due to the difference in allele frequency in the two groups, the results nevertheless focus attention on a smaller group of SNPs that includes the non-synonymous SNP rs16969968, which retains a similar effect size (odds ratio) across both population samples. Conclusion: Filtering out SNPs that demonstrate cross-population heterogeneity enriches for variants more likely to be important and causative. Our approach provides an important and effective tool to help interpret results from the many GWA studies now underway.
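
The cross-population heterogeneity test described in this abstract can be illustrated with a simplified stand-in. The abstract states that the authors used a logistic regression framework; the sketch below instead compares log odds ratios from two 2x2 allele-count tables with a z-test, which captures the same idea (does the effect size differ between samples?) without covariates. The function names and all counts are hypothetical and are not taken from the study.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Return (log OR, standard error) for a 2x2 table.

    a = risk-allele count in cases,    b = other-allele count in cases,
    c = risk-allele count in controls, d = other-allele count in controls.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf's standard error
    return log_or, se

def heterogeneity_test(table_pop1, table_pop2):
    """Two-sided z-test for a difference in log OR between two samples."""
    l1, s1 = log_odds_ratio(*table_pop1)
    l2, s2 = log_odds_ratio(*table_pop2)
    z = (l1 - l2) / math.sqrt(s1 ** 2 + s2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Hypothetical counts: same direction of effect vs. opposite direction.
z_same, p_same = heterogeneity_test((60, 40, 40, 60), (60, 40, 40, 60))
z_diff, p_diff = heterogeneity_test((60, 40, 40, 60), (40, 60, 60, 40))
print(f"consistent effect:   z={z_same:.2f}, p={p_same:.3g}")
print(f"inconsistent effect: z={z_diff:.2f}, p={p_diff:.3g}")
```

Under the filtering logic in the abstract, a SNP with a small heterogeneity p-value would be deprioritized, while a SNP with a large one (such as the reported p = 0.75 for rs16969968) would be retained.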

    Supplementing High-Density SNP Microarrays for Additional Coverage of Disease-Related Genes: Addiction as a Paradigm

    Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
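
The LD tagging step mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming phased two-SNP haplotypes and a tagging threshold of r² ≥ 0.8; the abstract does not state the threshold or the procedure the authors used, and the function names and data are hypothetical.

```python
def r_squared(haplotypes):
    """Pairwise LD r^2 from phased two-SNP haplotypes.

    Each haplotype is a pair (a, b): 1 if the haplotype carries the
    chosen allele at the candidate SNP / array SNP, 0 otherwise.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom

def is_tagged(pairings, threshold=0.8):
    """True if the candidate SNP reaches the threshold with any array SNP.

    `pairings` holds one haplotype list per array SNP; the 0.8 default
    is an assumed convention, not a value taken from the abstract.
    """
    return any(r_squared(h) >= threshold for h in pairings)

perfect_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(perfect_ld), r_squared(no_ld))
print(is_tagged([no_ld, perfect_ld]))
```

A candidate SNP for which `is_tagged` returns False against every array SNP is the kind of variant the abstract proposes adding as supplementary coverage.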

    Towards barcode markers in Fungi: an intron map of Ascomycota mitochondria

    Background: A standardized and cost-effective molecular identification system is now an urgent need for Fungi owing to their wide involvement in human life quality. In particular the potential use of mitochondrial DNA species markers has been taken in account. Unfortunately, a serious difficulty in the PCR and bioinformatic surveys is due to the presence of mobile introns in almost all the fungal mitochondrial genes. The aim of this work is to verify the incidence of this phenomenon in Ascomycota, testing, at the same time, a new bioinformatic tool for extracting and managing sequence databases annotations, in order to identify the mitochondrial gene regions where introns are missing so as to propose them as species markers. Methods: The general trend towards a large occurrence of introns in the mitochondrial genome of Fungi has been confirmed in Ascomycota by an extensive bioinformatic analysis, performed on all the entries concerning 11 mitochondrial protein coding genes and 2 mitochondrial rRNA (ribosomal RNA) specifying genes, belonging to this phylum, available in public nucleotide sequence databases. A new query approach has been developed to retrieve effectively introns information included in these entries. Results: After comparing the new query-based approach with a blast-based procedure, with the aim of designing a faithful Ascomycota mitochondrial intron map, the first method appeared clearly the most accurate. Within this map, despite the large pervasiveness of introns, it is possible to distinguish specific regions comprised in several genes, including the full NADH dehydrogenase subunit 6 (ND6) gene, which could be considered as barcode candidates for Ascomycota due to their paucity of introns and to their length, above 400 bp, comparable to the lower end size of the length range of barcodes successfully used in animals. Conclusion: The development of the new query system described here would answer the pressing requirement to improve drastically the bioinformatics support to the DNA Barcode Initiative. The large scale investigation of Ascomycota mitochondrial introns performed through this tool, allowing to exclude the introns-rich sequences from the barcode candidates exploration, could be the first step towards a mitochondrial barcoding strategy for these organisms, similar to the standard approach employed in metazoans.

    Chromosome 15q25 (CHRNA3-CHRNA5) Variation Impacts Indirectly on Lung Cancer Risk

    Genetic variants at the 15q25 CHRNA5-CHRNA3 locus have been shown to influence lung cancer risk; however, there is controversy as to whether variants have a direct carcinogenic effect on lung cancer risk or impact indirectly through smoking behavior. We have performed a detailed analysis of the 15q25 risk variants rs12914385 and rs8042374 with smoking behavior and lung cancer risk in 4,343 lung cancer cases and 1,479 controls from the Genetic Lung Cancer Predisposition Study (GELCAPS). Both rs12914385 and rs8042374 were strongly associated with lung cancer risk: odds ratios (OR) were 1.44 (95% confidence interval (CI): 1.29–1.62, P = 3.69×10⁻¹⁰) and 1.35 (95% CI: 1.18–1.55, P = 9.99×10⁻⁶), respectively. Each copy of risk alleles at rs12914385 and rs8042374 was associated with increased cigarette consumption of 1.0 and 0.9 cigarettes per day (CPD) (P = 5.18×10⁻⁵ and P = 5.65×10⁻³). These genetically determined modest differences in smoking behavior can be shown to be sufficient to account for the 15q25 association with lung cancer risk. To further verify the indirect effect of 15q25 on the risk, we restricted our analysis of lung cancer risk to never-smokers and conducted a meta-analysis of previously published studies of lung cancer risk in never-smokers. Never-smoker studies published in English were ascertained from PubMed using the search terms lung cancer, risk, genome-wide association, and candidate genes. Our study and five previously published studies provided data on 2,405 never-smoker lung cancer cases and 7,622 controls. In the pooled analysis no association was found between the 15q25 variation and lung cancer risk (OR = 1.09, 95% CI: 0.94–1.28). This study affirms the 15q25 association with smoking and is consistent with an indirect link between genotype and lung cancer risk.
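
The relationship between a reported odds ratio, its confidence interval, and its p-value can be checked with a short worked calculation. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the function name `wald_from_ci` is hypothetical. It uses the figures quoted above for rs12914385 (OR 1.44, 95% CI 1.29–1.62).

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_from_ci(odds_ratio, ci_low, ci_high):
    """Recover (SE of log OR, z statistic, two-sided p) from an OR and its 95% CI.

    Assumes the CI was built as exp(log OR +/- Z_95 * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

se, z, p = wald_from_ci(1.44, 1.29, 1.62)
print(f"SE(log OR)={se:.4f}, z={z:.2f}, p={p:.2e}")
```

Because published odds ratios and limits are rounded, the recovered p-value should only be expected to land near the reported one, not to match it exactly. The same calculation applied to the never-smoker pooled estimate (OR 1.09, 95% CI 0.94–1.28) gives an interval that spans 1, in line with the abstract's statement that no association was found.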

    The reference human nuclear mitochondrial sequences compilation validated and implemented on the UCSC genome browser

    Background: Eukaryotic nuclear genomes contain fragments of mitochondrial DNA called NumtS (Nuclear mitochondrial Sequences), whose mode and time of insertion, as well as their functional/structural role within the genome are debated issues. Insertion sites match with chromosomal breaks, revealing that micro-deletions usually occurring at non-homologous end joining loci become reduced in presence of NumtS. Some NumtS are involved in recombination events leading to fragment duplication. Moreover, NumtS are polymorphic, a feature that renders them candidates as population markers. Finally, they are a cause of contamination during human mtDNA sequencing, leading to the generation of false heteroplasmies. Results: Here we present RHNumtS.2, the most exhaustive human NumtSome catalogue annotating 585 NumtS, 97% of which were here validated in a European individual and in HapMap samples. The NumtS complete dataset and related features have been made available at the UCSC Genome Browser. The produced sequences have been submitted to INSDC databases. The implementation of the RHNumtS.2 tracks within the UCSC Genome Browser has been carried out with the aim to facilitate browsing of the NumtS tracks to be exploited in a wide range of research applications. Conclusions: We aimed at providing the scientific community with the most exhaustive overview on the human NumtSome, a resource whose aim is to support several research applications, such as studies concerning human structural variation, diversity, and disease, as well as the detection of false heteroplasmic mtDNA variants. Upon implementation of the NumtS tracks, the application of the BLAT program on the UCSC Genome Browser has now become an additional tool to check for heteroplasmic artefacts, supported by data available through the NumtS tracks.

    Associations of Variants in CHRNA5/A3/B4 Gene Cluster with Smoking Behaviors in a Korean Population

    Multiple genome-wide and targeted association studies reveal a significant association of variants in the CHRNA5-CHRNA3-CHRNB4 (CHRNA5/A3/B4) gene cluster on chromosome 15 with nicotine dependence. The subjects examined in most of these studies had a European origin. However, considering the distinct linkage disequilibrium patterns in European and other ethnic populations, it would be of tremendous interest to determine whether such associations could be replicated in populations of other ethnicities, such as Asians. In this study, we performed comprehensive association and interaction analyses for 32 single-nucleotide polymorphisms (SNPs) in CHRNA5/A3/B4 with smoking initiation (SI), smoking quantity (SQ), and smoking cessation (SC) in a Korean sample (N = 8,842). We found nominally significant associations of 7 SNPs with at least one smoking-related phenotype in the total sample (SI: P = 0.015∼0.023; SQ: P = 0.008∼0.028; SC: P = 0.018∼0.047) and the male sample (SI: P = 0.001∼0.023; SQ: P = 0.001∼0.046; SC: P = 0.01). A spectrum of haplotypes formed by three consecutive SNPs located between rs16969948 in CHRNA5 and rs6495316 in the intergenic region downstream from the 5′ end of CHRNB4 was associated with these three smoking-related phenotypes in both the total and the male sample. Notably, associations of these variants and haplotypes with SC appear to be much weaker than those with SI and SQ. In addition, we performed an interaction analysis of SNPs within the cluster using the generalized multifactor dimensionality reduction method and found a significant interaction of SNPs rs7163730 in LOC123688, rs6495308 in CHRNA3, and rs7166158, rs8043123, and rs11072793 in the intergenic region downstream from the 5′ end of CHRNB4 to be influencing SI in the male sample. Considering that fewer than 5% of the female participants were smokers, we did not perform any analysis on female subjects specifically. 
Together, our detected associations of variants in the CHRNA5/A3/B4 cluster with SI, SQ, and SC in the Korean smoker samples provide strong evidence for the contribution of this cluster to the etiology of SI, ND, and SC in this Asian population.

    An Open Access Database of Genome-wide Association Results

    Background: The number of genome-wide association studies (GWAS) is growing rapidly leading to the discovery and replication of many new disease loci. Combining results from multiple GWAS datasets may potentially strengthen previous conclusions and suggest new disease loci, pathways or pleiotropic genes. However, no database or centralized resource currently exists that contains anywhere near the full scope of GWAS results. Methods: We collected available results from 118 GWAS articles into a database of 56,411 significant SNP-phenotype associations and accompanying information, making this database freely available here. In doing so, we met and describe here a number of challenges to creating an open access database of GWAS results. Through preliminary analyses and characterization of available GWAS, we demonstrate the potential to gain new insights by querying a database across GWAS. Results: Using a genomic bin-based density analysis to search for highly associated regions of the genome, positive control loci (e.g., MHC loci) were detected with high sensitivity. Likewise, an analysis of highly repeated SNPs across GWAS identified replicated loci (e.g., APOE, LPL). At the same time we identified novel, highly suggestive loci for a variety of traits that did not meet genome-wide significant thresholds in prior analyses, in some cases with strong support from the primary medical genetics literature (SLC16A7, CSMD1, OAS1), suggesting these genes merit further study. Additional adjustment for linkage disequilibrium within most regions with a high density of GWAS associations did not materially alter our findings. Having a centralized database with standardized gene annotation also allowed us to examine the representation of functional gene categories (gene ontologies) containing one or more associations among top GWAS results. Genes relating to cell adhesion functions were highly over-represented among significant associations (p < 4.6 × 10⁻¹⁴), a finding which was not perturbed by a sensitivity analysis. Conclusion: We provide access to a full gene-annotated GWAS database which could be used for further querying, analyses or integration with other genomic information. We make a number of general observations. Of reported associated SNPs, 40% lie within the boundaries of a RefSeq gene and 68% are within 60 kb of one, indicating a bias toward gene-centricity in the findings. We found considerable heterogeneity in information available from GWAS suggesting the wider community could benefit from standardization and centralization of results reporting.
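
The genomic bin-based density analysis named in this abstract can be pictured with a small sketch. It assumes fixed-width, non-overlapping bins of 1 Mb per chromosome and simply counts associations per bin; the bin width, the function name `bin_density`, and the positions are hypothetical, and the abstract does not describe the authors' actual binning or any LD adjustment.

```python
from collections import Counter

def bin_density(associations, bin_size=1_000_000):
    """Count SNP-phenotype associations per (chromosome, bin index).

    `associations` is an iterable of (chromosome, base-pair position) pairs;
    `bin_size` is an assumed fixed bin width in base pairs.
    """
    counts = Counter()
    for chrom, pos in associations:
        counts[(chrom, pos // bin_size)] += 1
    return counts

# Made-up positions for illustration only.
hits = [
    ("6", 31_000_500),
    ("6", 31_900_000),
    ("6", 32_100_000),
    ("19", 45_411_941),
]
density = bin_density(hits)
for (chrom, idx), n in density.most_common():
    print(f"chr{chrom}: bin {idx} -> {n} associations")
```

Bins with unusually high counts would be the candidates for "highly associated regions"; a fuller analysis would also need to account for SNP density and linkage disequilibrium within each bin.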

    A New Methodology to Associate SNPs with Human Diseases According to Their Pathway Related Context

    Genome-wide association studies (GWAS) with hundreds of thousands of single nucleotide polymorphisms (SNPs) are popular strategies to reveal the genetic basis of human complex diseases. Despite many successes of GWAS, it is well recognized that new analytical approaches have to be integrated to achieve their full potential. Starting with a list of SNPs found to be associated with disease in GWAS, here we propose a novel methodology to devise functionally important KEGG pathways through the identification of genes within these pathways, where these genes are obtained from SNP analysis. Our methodology is based on functionalization of important SNPs to identify affected genes and disease-related pathways. We have tested our methodology on the WTCCC Rheumatoid Arthritis (RA) dataset and identified: i) previously known RA-related KEGG pathways (e.g., Toll-like receptor signaling, Jak-STAT signaling, Antigen processing, Leukocyte transendothelial migration and MAPK signaling pathways); ii) additional KEGG pathways (e.g., Pathways in cancer, Neurotrophin signaling, Chemokine signaling pathways) as associated with RA. Furthermore, these newly found pathways included genes which are targets of RA-specific drugs. Whereas GWAS analysis identifies 14 of the 83 genes known to be used as drug targets for the treatment of RA, the newly found functionally important KEGG pathways led to the discovery of 25 of those 83 genes. Among the previously known pathways, we identified additional genes associated with RA (e.g. Antigen processing and presentation, Tight junction). Importantly, within these pathways, the associations between RA and some of these additionally found genes, such as HLA-C, HLA-G, PRKCQ, PRKCZ, TAP1 and TAP2, were verified by either the OMIM database or literature retrieved from the NCBI PubMed module. With whole-genome sequencing on the horizon, we show that the full potential of GWAS can be achieved by integrating pathway and network-oriented analysis and prior knowledge from functional properties of a SNP.

    The Nuclear Transcription Factor PKNOX2 Is a Candidate Gene for Substance Dependence in European-Origin Women

    Substance dependence or addiction is a complex environmental and genetic disorder that results in serious health and socio-economic consequences. Multiple substance dependence categories together, rather than any one individual addiction outcome, may explain the genetic variability of such a disorder. In our study, we defined a composite substance dependence phenotype derived from six individual diagnoses: addiction to nicotine, alcohol, marijuana, cocaine, opiates or other drugs as a whole. Using data from several genome-wide case-control studies, we identified a strong (odds ratio = 1.77) and significant (p-value = 7E-8) association signal with a novel gene, PBX/knotted 1 homeobox 2 (PKNOX2), on chromosome 11 with the composite phenotype in European-origin women. The association signal is not as significant when individual outcomes for addiction are considered, or in males or African-origin populations. Our findings underscore the importance of considering multiple addiction types and the importance of considering population and gender stratification when analyzing data from heterogeneous populations.