
    A Survey of Genomic Properties for the Detection of Regulatory Polymorphisms

    Advances in the computational identification of functional noncoding polymorphisms will aid in cataloging novel determinants of health and identifying genetic variants that explain human evolution. To date, however, the development and evaluation of such techniques have been limited by the availability of known regulatory polymorphisms. We have attempted to address this by assembling, from the literature, a computationally tractable set of regulatory polymorphisms within the ORegAnno database (http://www.oreganno.org). We have further used 104 regulatory single-nucleotide polymorphisms from this set and 951 polymorphisms of unknown function, from 2-kb and 152-bp noncoding upstream regions of genes, to investigate the discriminatory potential of 23 properties related to gene regulation and population genetics. Among the most important properties detected in this region are distance to transcription start site, local repetitive content, sequence conservation, minor and derived allele frequencies, and presence of a CpG island. We further used the entire set of properties to evaluate their collective performance in detecting regulatory polymorphisms. Using a 10-fold cross-validation approach, we were able to achieve a sensitivity and specificity of 0.82 and 0.71, respectively, and we show that this performance is strongly influenced by the distance to the transcription start site.
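
The sensitivity and specificity reported in the abstract above are standard confusion-matrix ratios. The following is a minimal sketch of how such values are computed from binary labels. The function name and the toy labels are illustrative assumptions; they are not the study's classifier or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = regulatory, 0 = other).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


# Toy labels (made up): four known positives, four negatives.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
print(sensitivity_specificity(truth, predicted))
```

In a 10-fold cross-validation setting, these ratios would typically be computed on the held-out predictions pooled across folds, or averaged per fold.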

    Significant association of a M129V independent polymorphism in the 5′ UTR of the PRNP gene with sporadic Creutzfeldt-Jakob disease in a large German case-control study

    Background: A single nucleotide polymorphism (SNP) in the coding region of the prion protein gene (PRNP) at codon 129 has been repeatedly shown to be associated with sporadic Creutzfeldt-Jakob disease (sCJD), but additional major predisposing DNA variants for sCJD are still unknown. Several previous studies focused on the characterisation of polymorphisms in PRNP and the prion-like doppel gene (PRND), generating contradictory results on relatively small sample sets. Thus, extensive studies are required for validation of the polymorphisms in PRNP and PRND. Methods: We evaluated a set of nine SNPs of PRNP and one SNP of PRND in 593 German sCJD patients and 748 German healthy controls. Genotyping was performed using MALDI-TOF mass spectrometry. Results: In addition to PRNP 129, we detected a significant association between sCJD and allele frequencies of six further PRNP SNPs. No significant association of PRND T174M with sCJD was shown. We observed strong linkage disequilibrium within eight adjacent PRNP SNPs, including PRNP 129. However, the association of sCJD with PRNP 1368 and PRNP 34296 appeared to be independent of the genotype of PRNP 129. We additionally identified the most common haplotypes of PRNP to be over-represented or under-represented in our cohort of patients with sCJD. Conclusion: Our study evaluated previous findings of the association of SNPs in the PRNP and PRND genes in the largest cohorts for an association study in sCJD to date, and extends previous findings by defining for the first time the haplotypes associated with sCJD in a large population of the German CJD surveillance study.
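
An association between a disease and allele frequencies, as described in the abstract above, is commonly tested with a chi-square test on a 2x2 table of allele counts in cases and controls. The sketch below shows that generic test only. The abstract does not state which statistic the authors used, and the counts here are invented for illustration.

```python
import math


def allele_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for a 2x2 table of allele counts. Hypothetical helper, not the study's code."""
    n = case_a + case_b + control_a + control_b
    row_cases = case_a + case_b
    row_controls = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if 0 in (row_cases, row_controls, col_a, col_b):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / (
        row_cases * row_controls * col_a * col_b
    )
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented allele counts: allele A vs. allele B in cases and controls.
chi2, p = allele_chi_square(case_a=10, case_b=20, control_a=20, control_b=10)
print(chi2, p)
```

Each genotyped individual contributes two alleles, so the table totals are twice the number of cases and controls when counts are taken at the allele level.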

    An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

    Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. Conclusions: The results show that SCintuit improves on the prediction quality of existing approaches for TFs without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven.
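
For context on the motif-matching task described above, the sketch below shows the conventional baseline: a log-odds position weight matrix (PWM) that treats motif positions as independent. This is explicitly not SCintuit, whose IFS-based formula is not given in the abstract. The function names, pseudocount, uniform background, and example sites are all illustrative assumptions.

```python
import math

BASES = "ACGT"


def build_log_odds_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned binding sites.

    Assumes a uniform background and independent positions (a baseline, not SCintuit).
    """
    width = len(sites[0])
    pwm = []
    for i in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds for one window of the motif's width."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    candidates = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(candidates)


# Made-up example sites resembling a TATA-like motif.
example_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_log_odds_pwm(example_sites)
print(best_match(pwm, "GGCGTATAATGCGC"))
```

Because each column is scored on its own, a baseline like this cannot capture the dependencies between positions that the abstract identifies as important.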

    In Silico Detection of Sequence Variations Modifying Transcriptional Regulation

    Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation