94 research outputs found

    SPSmart: adapting population based SNP genotype databases for fast and comprehensive web access

    Abstract
    Background: In the last five years large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with population information act as catalogues of human diversity, and are widely used as reference sources for population genetics studies. Although many useful conclusions may be extracted by querying databases individually, the lack of flexibility for combining data from within and between each database does not allow the calculation of key population variability statistics.
    Results: We have developed a novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNPs) in widespread use in human population genetics: SPSmart (SNPs for Population Studies). A fast pipeline creates and maintains a data mart from the most commonly accessed databases of genotypes containing population information: data is mined, summarized into the standard statistical reference indices, and stored into a relational database that currently handles as many as 4 × 10^9 genotypes and that can be easily extended to new database initiatives. We have also built a web interface to the data mart that allows the browsing of underlying data indexed by population and the combining of populations, allowing intuitive and straightforward comparison of population groups. All the information served is optimized for web display, and most of the computations are already pre-processed in the data mart to speed up the data browsing and any computational treatment requested.
    Conclusion: In practice, SPSmart allows populations to be combined into user-defined groups, while multiple databases can be accessed and compared in a few simple steps from a single query. It performs the queries rapidly and gives straightforward graphical summaries of SNP population variability through visual inspection of allele frequencies outlined in standard pie-chart format. In addition, full numerical description of the data is output in statistical results panels that include common population genetics metrics such as heterozygosity, Fst and In.
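
The abstract above names heterozygosity and Fst among the reported metrics. As a rough illustration only, the sketch below computes expected heterozygosity and Wright's Fst for a single biallelic SNP from per-population allele frequencies. The function names, the optional sample-size weighting, and the choice of the simple (Ht - Hs) / Ht estimator are assumptions made for this example; the abstract does not say which estimator SPSmart uses.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs, sizes=None):
    """Wright's Fst = (Ht - Hs) / Ht for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    sizes: optional population weights (hypothetical parameter; equal
           weights are assumed when omitted).
    """
    if sizes is None:
        sizes = [1.0] * len(freqs)
    total = float(sum(sizes))
    weights = [n / total for n in sizes]

    # Ht: heterozygosity expected in the pooled (combined) population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = expected_heterozygosity(p_bar)

    # Hs: weighted mean of the within-population heterozygosities.
    h_s = sum(w * expected_heterozygosity(p) for w, p in zip(weights, freqs))

    # A SNP that is monomorphic in the pooled sample carries no signal.
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t
```

Populations with identical frequencies give an Fst of 0, and populations fixed for opposite alleles give 1, which is the behaviour one would expect when comparing user-defined population groups.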

    interPopula: a Python API to access the HapMap Project dataset

    Abstract
    Background: The HapMap project is a publicly available catalogue of common genetic variants that occur in humans, currently including several million SNPs across 1115 individuals spanning 11 different populations. This important database does not provide any programmatic access to the dataset; furthermore, no standard relational database interface is provided.
    Results: interPopula is a Python API to access the HapMap dataset. interPopula provides integration facilities with both the Python software ecosystem (e.g. Biopython and matplotlib) and other relevant human population datasets (e.g. Ensembl gene annotation and UCSC Known Genes). A set of guidelines and code examples to address possible inconsistencies across heterogeneous data sources is also provided.
    Conclusions: interPopula is a straightforward and flexible Python API that facilitates the construction of scripts and applications that require access to the HapMap dataset.

    SNP selection for genes of iron metabolism in a study of genetic modifiers of hemochromatosis

    Abstract
    Background: We report our experience of selecting tag SNPs in 35 genes involved in iron metabolism in a cohort study seeking to discover genetic modifiers of hereditary hemochromatosis.
    Methods: We combined our own and publicly available resequencing data with HapMap to maximise our coverage to select 384 SNPs in candidate genes suitable for typing on the Illumina platform.
    Results: Validation/design scores above 0.6 were not strongly correlated with SNP performance as estimated by Gentrain score. We contrasted results from two tag SNP selection algorithms, LDselect and Tagger. Varying r^2 from 0.5 to 1.0 produced a near linear correlation with the number of tag SNPs required. We examined the pattern of linkage disequilibrium of three levels of resequencing coverage for the transferrin gene and found HapMap phase 1 tag SNPs capture 45% of the ≥ 3% MAF SNPs found in SeattleSNPs where there is nearly complete resequencing. Resequencing can reveal adjacent SNPs (within 60 bp) which may affect assay performance. We report the number of SNPs present within the region of six of our larger candidate genes, for different versions of stock genotyping assays.
    Conclusion: A candidate gene approach should seek to maximise coverage, and this can be improved by adding to HapMap data any available sequencing data. Tag SNP software must be fast and flexible to data changes, since tag SNP selection involves iteration as investigators seek to satisfy the competing demands of coverage within and between populations, and typability on the technology platform chosen.
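
To make the relationship between the r^2 threshold and the number of tag SNPs concrete, here is a minimal greedy sketch. It is not the LDselect or Tagger implementation; it is a simplified stand-in that assumes pairwise r^2 values are already available in a dictionary, and the function name, argument layout, and SNP labels are all hypothetical.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedily pick tag SNPs so every SNP is a tag or has r^2 >= threshold with one.

    snps: list of SNP identifiers (order is used to break ties).
    r2:   dict mapping (snp_a, snp_b) pairs to pairwise r^2; missing pairs
          are treated as r^2 = 0 (an assumption of this sketch).
    """
    def linked(a, b):
        # A SNP always covers itself; pairs are looked up in either order.
        if a == b:
            return True
        return r2.get((a, b), r2.get((b, a), 0.0)) >= threshold

    untagged = set(snps)
    tags = []
    while untagged:
        # Candidates in input order, so ties resolve deterministically.
        candidates = [s for s in snps if s in untagged]
        # Choose the candidate that covers the most still-untagged SNPs.
        best = max(candidates, key=lambda s: sum(linked(s, t) for t in untagged))
        tags.append(best)
        untagged -= {t for t in untagged if linked(best, t)}
    return tags
```

With hypothetical values `r2 = {("s1", "s2"): 0.9, ("s2", "s3"): 0.85, ("s3", "s4"): 0.2}`, a threshold of 0.8 needs two tags while a threshold of 1.0 needs all four SNPs, which matches the direction of the trend the abstract reports.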

    Population based allele frequencies of disease associated polymorphisms in the Personalized Medicine Research Project

    Abstract
    Background: There is a lack of knowledge regarding the frequency of disease associated polymorphisms in populations, and population attributable risk for many populations remains unknown. Factors that could affect the association of the allele with disease, either positively or negatively, such as race, ethnicity, and gender, may not be possible to determine without population based allele frequencies.
    Here we used a panel of 51 polymorphisms previously associated with at least one disease and determined the allele frequencies within the entire Personalized Medicine Research Project population based cohort. We compared these allele frequencies to those in dbSNP and other data sources stratified by race. Differences in allele frequencies between self reported race, region of origin, and sex were determined.
    Results: There were 19,544 individuals who self reported a single racial category; 19,027 (97.4%) self reported white Caucasian, and 11,205 (57.3%) individuals were female. Of the 11,208 (57%) individuals with an identifiable region of origin, 8,337 (74.4%) were German.
    Forty-one polymorphisms were significantly different between self reported race at the 0.05 level. Stratification of our Caucasian population by self reported region of origin revealed 19 polymorphisms that were significantly different (p = 0.05) between individuals of different origins. Further stratification of the population by gender revealed few significant differences in allele frequencies between the genders.
    Conclusions: This represents one of the largest population based allele frequency studies to date. Stratification by self reported race and region of origin revealed wide differences in allele frequencies not only by race but also by region of origin within a single racial group. We report allele frequencies for our Asian/Hmong and American Indian populations; these two minority groups are not typically selected for population allele frequency detection. Population wide allele frequencies are important for the design and implementation of studies and for determining the relevance of a disease associated polymorphism for a given population.

    PADB : Published Association Database

    Abstract
    Background: Although molecular pathway information and the International HapMap Project data can help biomedical researchers to investigate the aetiology of complex diseases more effectively, such information is missing or insufficient in current genetic association databases. In addition, only a few of the environmental risk factors are included as gene-environment interactions, and the risk measures of associations are not indexed in any association databases.
    Description: We have developed a published association database (PADB; http://www.medclue.com/padb) that includes both the genetic associations and the environmental risk factors available in the PubMed database. Each genetic risk factor is linked to a molecular pathway database and the HapMap database through human gene symbols identified in the abstracts. Risk measures such as odds ratios or hazard ratios are extracted automatically from the abstracts when available. Thus, users can review the association data sorted by the risk measures, and genetic associations can be grouped by human genes or molecular pathways. The search results can also be saved to tab-delimited text files for further sorting or analysis. Currently, PADB indexes more than 1,500,000 PubMed abstracts that include 3442 human genes, 461 molecular pathways and about 190,000 risk measures ranging from 0.00001 to 4878.9.
    Conclusion: PADB is a unique online database of published associations that will serve as a novel and powerful resource for reviewing and interpreting huge association data of complex human diseases.

    rs4919510 in hsa-mir-608 Is Associated with Outcome but Not Risk of Colorectal Cancer

    Colorectal cancer is the third most commonly diagnosed cancer and cause of cancer-related death in the United States. MicroRNAs, a class of small non-coding RNAs, have been implicated in the pathogenesis and prognosis of colorectal cancer, although few studies have examined the relationship between germline mutations in microRNAs and risk and prognosis. We therefore investigated the association between a SNP in hsa-mir-608, which lies within the 10q24 locus, and colorectal cancer. A cohort consisting of 245 cases and 446 controls was genotyped for rs4919510. The frequency of the GG genotype was significantly higher in African American (15%) than in Caucasian (3%) controls. There was no significant association between rs4919510 and colorectal cancer risk (African American: OR (GG vs. CC) 0.89 [95% CI, 0.41-1.80]; Caucasian: OR (GG vs. CC) 1.76 [95% CI, 0.48-6.39]). However, we did observe an association with survival. The GG genotype was associated with an increased risk of death in Caucasians (HR (GG vs. CC) 3.54 [95% CI, 1.38-9.12]) and with a reduced risk of death in African Americans (HR (GG vs. CC) 0.36 [95% CI, 0.12-1.07]). These results suggest that rs4919510 may be associated with colorectal cancer survival in a manner that is dependent on race.

    Glutathione pathway gene variation and risk of autism spectrum disorders

    Despite evidence that autism is highly heritable with estimates of 15 or more genes involved, few studies have directly examined associations of multiple gene interactions. Since inability to effectively combat oxidative stress has been suggested as a mechanism of autism, we examined genetic variation in 42 genes (308 single-nucleotide polymorphisms (SNPs)) related to glutathione, the most important antioxidant in the brain, for both marginal association and multi-gene interaction among 318 case–parent trios from The Autism Genetic Resource Exchange. Models of multi-SNP interactions were estimated using the trio Logic Regression method. A three-SNP joint effect was observed for genotype combinations of SNPs in glutaredoxin, glutaredoxin 3 (GLRX3), and cystathionine gamma lyase (CTH); OR = 3.78, 95% CI: 2.36, 6.04. Marginal associations were observed for four genes including two involved in the three-way interaction: CTH, alcohol dehydrogenase 5, gamma-glutamylcysteine synthetase, catalytic subunit and GLRX3. These results suggest that variation in genes involved in counterbalancing oxidative stress may contribute to autism, though replication is necessary.

    Genetic influences on attention deficit hyperactivity disorder symptoms from age 2 to 3: A quantitative and molecular genetic investigation

    Abstract
    Background: A twin study design was used to assess the degree to which additive genetic variance influences ADHD symptom scores across two ages during infancy. A further objective in the study was to observe whether genetic association with a number of candidate markers reflects results from the quantitative genetic analysis.
    Method: We have studied 312 twin pairs at two time-points, age 2 and age 3. A composite measure of ADHD symptoms from two parent-rating scales, the Child Behavior Checklist/1.5 - 5 years (CBCL) hyperactivity scale and the Revised Rutter Parent Scale for Preschool Children (RRPSPC), was used for both quantitative and molecular genetic analyses.
    Results: At ages 2 and 3 ADHD symptoms are highly heritable (h^2 = 0.79 and 0.78, respectively) with a high level of genetic stability across these ages. However, we also observe a significant level of genetic change from age 2 to age 3. There are modest influences of non-shared environment at each age independently (e^2 = 0.22 and 0.21, respectively), with these influences being largely age-specific. In addition, we find modest association signals in DAT1 and NET1 at both ages, along with suggestive specific effects of 5-HTT and DRD4 at age 3.
    Conclusions: ADHD symptoms are heritable at ages 2 and 3. Additive genetic variance is largely shared across these ages, although there are significant new effects emerging at age 3. Results from our genetic association analysis reflect these levels of stability and change and, more generally, suggest a requirement for consideration of age-specific genotypic effects in future molecular studies.

    Toll-like receptor gene polymorphisms are associated with susceptibility to Graves' ophthalmopathy in Taiwan males

    Abstract
    Background: Toll-like receptors (TLRs) are a family of pattern-recognition receptors, which play a role in eliciting innate/adaptive immune responses and developing chronic inflammation. The polymorphisms of TLRs have been associated with the risk of various autoimmune diseases, including systemic lupus erythematosus (SLE), multiple sclerosis and rheumatoid arthritis. The aim of this study was to evaluate whether TLR genes could be used as genetic markers for the development of Graves' ophthalmopathy (GO).
    Methods: Six TLR-4 and two TLR-9 gene polymorphisms were evaluated in 471 Graves' disease (GD) patients (200 patients with GO and 271 patients without GO) from a Taiwan Chinese population.
    Results: No statistically significant difference was observed in the genotypic and allelic frequencies of TLR-4 and TLR-9 gene polymorphisms between the GD patients with and without GO. However, sex-stratified analyses showed that the association between TLR-9 gene polymorphism and GO phenotype was more pronounced in the male patients. The odds ratios (ORs) were 2.11 (95% confidence interval [CI] = 1.14-3.91) for the rs187084 A→G polymorphism and 1.97 (95% CI = 1.07-3.62) for the rs352140 A→G polymorphism among the male patients. Each additional G allele of rs187084 and A allele of rs352140 increased the risk of GO (p values for trend tests were 0.0195 and 0.0345, respectively). Further, in haplotype analyses, the male patients carrying the GA haplotype had a higher risk of GO (OR = 2.02, 95% CI = 1.09-3.73) than those not carrying the GA haplotype.
    Conclusion: The present data suggest that TLR-9 gene polymorphisms were significantly associated with increased susceptibility of ophthalmopathy in male GD patients.
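
Several abstracts in this listing report odds ratios with 95% confidence intervals. As a hedged illustration of how such figures are commonly derived from a 2x2 table, the sketch below uses the standard log-odds (Woolf) approximation. The function name and the argument layout are assumptions for this example, and none of the abstracts state which method their authors actually used or give the underlying counts.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate confidence interval from a 2x2 table.

    a: cases carrying the risk genotype      b: cases without it
    c: controls carrying the risk genotype   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval).
    All four cell counts must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1, such as the reported 1.14-3.91, is what supports calling an association statistically significant at the 5% level; an interval that spans 1 does not.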