    SOP(3)v2: web-based selection of oligonucleotide primer trios for genotyping of human and mouse polymorphisms

    SOP(3)v2 is a database-driven graphical web-based application for facilitating genotyping assay design. SOP(3)v2 accepts data input in numerous forms, including gene names, reference sequence numbers and physical location. For each entry, the application presents a set of recommended forward and reverse PCR primers, along with a sequencing primer, which is optimized for sequence-based genotyping assays. SOP(3)v2-generated oligonucleotide primer trios enable analysis of single nucleotide polymorphisms (SNPs) as well as insertion/deletion polymorphisms found in genomic DNA. The application's database was generated by warehousing information from the National Center for Biotechnology Information (NCBI) dbSNP database, genomic DNA sequences from human and mouse, and LocusLink gene attribute information. Query results can be sorted by their biological relevance, such as nonsynonymous coding changes or physical location. Human polymorphism queries may specify ethnicity, haplotype and validation status. Primers are developed using SOP(3)v2's core algorithm for evaluating primer candidates through stability tests and are suitable for use with sequence-based genotyping methods requiring locus-specific amplification. The method has undergone laboratory validation. Of the SOP(3)v2-designed primer trios that were tested, a majority (>80%) have successfully produced genotyping data. The application may be accessed via the web at

    Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data

    Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.

    An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

    Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. Conclusions: The results show that SCintuit improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven

    Expression of Regulatory Platelet MicroRNAs in Patients with Sickle Cell Disease

    Background: Increased platelet activation in sickle cell disease (SCD) contributes to a state of hypercoagulability and confers a risk of thromboembolic complications. The role for post-transcriptional regulation of the platelet transcriptome by microRNAs (miRNAs) in SCD has not been previously explored. This is the first study to determine whether platelets from SCD exhibit an altered miRNA expression profile. Methods and Findings: We analyzed the expression of miRNAs isolated from platelets from a primary cohort (SCD = 19, controls = 10) and a validation cohort (SCD = 7, controls = 7) by hybridizing to the Agilent miRNA microarrays. A dramatic difference in miRNA expression profiles between patients and controls was noted in both cohorts separately. A total of 40 differentially expressed platelet miRNAs were identified as common in both cohorts (p-value 0.05, fold change>2) with 24 miRNAs downregulated. Interestingly, 14 of the 24 downregulated miRNAs were members of three families - miR-329, miR-376 and miR-154 - which localized to the epigenetically regulated, maternally imprinted chromosome 14q32 region. We validated the downregulated miRNAs, miR-376a and miR-409-3p, and an upregulated miR-1225-3p using qRT-PCR. Over-expression of the miR-1225-3p in the Meg01 cells was followed by mRNA expression profiling to identify mRNA targets. This resulted in significant transcriptional repression of 1605 transcripts. A combinatorial approach using Meg01 mRNA expression profiles following miR-1225-3p overexpression, a computational prediction analysis of miRNA target sequences and a previously published set of differentially expressed platelet transcripts from SCD patients, identified three novel platelet mRNA targets: PBXIP1, PLAGL2 and PHF20L1. Conclusions: We have identified significant differences in functionally active platelet miRNAs in patients with SCD as compared to controls. 
    These data provide an important inventory of differentially expressed miRNAs in SCD patients and an experimental framework for future studies of miRNAs as regulators of biological pathways in platelets. © 2013 Jain et al.

    Inferring Binding Energies from Selected Binding Sites

    We employ a biophysical model that accounts for the non-linear relationship between binding energy and the statistics of selected binding sites. The model includes the chemical potential of the transcription factor, non-specific binding affinity of the protein for DNA, as well as sequence-specific parameters that may include non-independent contributions of bases to the interaction. We obtain maximum likelihood estimates for all of the parameters and compare the results to standard probabilistic methods of parameter estimation. On simulated data, where the true energy model is known and samples are generated with a variety of parameter values, we show that our method returns much more accurate estimates of the true parameters and much better predictions of the selected binding site distributions. We also introduce a new high-throughput SELEX (HT-SELEX) procedure to determine the binding specificity of a transcription factor in which the initial randomized library and the selected sites are sequenced with next generation methods that return hundreds of thousands of sites. We show that after a single round of selection our method can estimate binding parameters that give very good fits to the selected site distributions, much better than standard motif identification algorithms

    A Linear Model for Transcription Factor Binding Affinity Prediction in Protein Binding Microarrays

    Protein binding microarrays (PBM) are a high throughput technology used to characterize protein-DNA binding. The arrays measure a protein's affinity toward thousands of double-stranded DNA sequences at once, producing a comprehensive binding specificity catalog. We present a linear model for predicting the binding affinity of a protein toward DNA sequences based on PBM data. Our model represents the measured intensity of an individual probe as a sum of the binding affinity contributions of the probe's subsequences. These subsequences characterize a DNA binding motif and can be used to predict the intensity of protein binding against arbitrary DNA sequences. Our method was the best performer in the Dialogue for Reverse Engineering Assessments and Methods 5 (DREAM5) transcription factor/DNA motif recognition challenge. For the DREAM5 bonus challenge, we also developed an approach for the identification of transcription factors based on their PBM binding profiles. Our approach for TF identification achieved the best performance in the bonus challenge
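The additive idea in this abstract, where a probe's intensity is modeled as a sum of contributions from its subsequences, can be illustrated with a small sketch. This is not the authors' DREAM5 method: the choice of fixed-length k-mers as the subsequences, the plain least-squares fit, and the function names `kmer_counts`, `fit_linear_model` and `predict` are all assumptions made for illustration.

```python
from itertools import product

import numpy as np


def kmer_counts(seq, k):
    """Count every DNA k-mer in seq, in fixed lexicographic order (A, C, G, T)."""
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip windows containing non-ACGT characters
            counts[pos] += 1
    return counts


def fit_linear_model(probes, intensities, k=2):
    """Least-squares fit of one weight per k-mer (assumed stand-in for the paper's fit)."""
    X = np.array([kmer_counts(p, k) for p in probes])
    y = np.asarray(intensities, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict(seq, weights, k=2):
    """Predicted intensity: the sum of the fitted contributions of the k-mers in seq."""
    return float(kmer_counts(seq, k) @ weights)
```

Calling `fit_linear_model(probes, intensities)` on measured probe data and then `predict(new_seq, weights)` scores an arbitrary sequence. With far more k-mers than probes the fit is underdetermined, so a real application would likely need regularization or a restricted feature set.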

    Novel Modeling of Combinatorial miRNA Targeting Identifies SNP with Potential Role in Bone Density

    MicroRNAs (miRNAs) are post-transcriptional regulators that bind to their target mRNAs through base complementarity. Predicting miRNA targets is a challenging task and various studies showed that existing algorithms suffer from a high number of false predictions and low to moderate overlap in their predictions. Until recently, very few algorithms considered the dynamic nature of the interactions, including the effect of less specific interactions, the miRNA expression level, and the effect of combinatorial miRNA binding. Addressing these issues can result in a more accurate miRNA:mRNA modeling with many applications, including efficient miRNA-related SNP evaluation. We present a novel thermodynamic model based on the Fermi-Dirac equation that incorporates miRNA expression in the prediction of target occupancy and we show that it improves the performance of two popular single miRNA target finders. Modeling combinatorial miRNA targeting is a natural extension of this model. Two other algorithms show improved prediction efficiency when combinatorial binding models were considered. ComiR (Combinatorial miRNA targeting), a novel algorithm we developed, incorporates the improved predictions of the four target finders into a single probabilistic score using ensemble learning. Combining target scores of multiple miRNAs using ComiR improves predictions over the naïve method for target combination. The ComiR scoring scheme can be used for identification of SNPs affecting miRNA binding. As proof of principle, ComiR identified rs17737058 as disruptive to the miR-488-5p:NCOA1 interaction, which we confirmed in vitro. We also found rs17737058 to be significantly associated with decreased bone mineral density (BMD) in two independent cohorts, indicating that the miR-488-5p/NCOA1 regulatory axis is likely critical in maintaining BMD in women. 
    With increasing availability of comprehensive high-throughput datasets from patients, ComiR is expected to become an essential tool for miRNA-related studies. © 2012 Coronnello et al.
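The Fermi-Dirac form mentioned in this abstract can be sketched generically as an occupancy probability that depends on a site's binding energy and on a chemical potential reflecting miRNA abundance. The sketch below is not ComiR's scoring scheme: the parameter names (`energy`, `mu`, `kT`), the independence assumption between sites, and the function names are hypothetical and chosen only to show the shape of the calculation.

```python
import math


def fermi_dirac_occupancy(energy, mu, kT=1.0):
    """Probability that a site is bound: 1 / (1 + exp((energy - mu) / kT)).

    energy: binding energy of the miRNA:site pair (lower means stronger binding).
    mu: chemical potential, assumed here to rise with miRNA expression.
    """
    return 1.0 / (1.0 + math.exp((energy - mu) / kT))


def prob_any_site_bound(sites, kT=1.0):
    """Probability that at least one site is bound, treating sites as independent.

    sites: iterable of (energy, mu) pairs, one per miRNA binding site on the mRNA.
    Independence is a simplifying assumption, not something the abstract states.
    """
    p_all_free = 1.0
    for energy, mu in sites:
        p_all_free *= 1.0 - fermi_dirac_occupancy(energy, mu, kT)
    return 1.0 - p_all_free
```

A site whose energy equals the chemical potential is bound half the time. Raising `mu`, for instance to model a more highly expressed miRNA, pushes the occupancy toward 1, which is how expression level enters a model of this form.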

    RNA deep sequencing reveals differential MicroRNA expression during development of sea urchin and sea star

    microRNAs (miRNAs) are small (20-23 nt), non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin) and Patiria miniata (sea star) are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc.) to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads). Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common). We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html. © 2011 Kadri et al

    Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences

    We report a high-quality draft of the genome sequence of the grey, short-tailed opossum (Monodelphis domestica). As the first metatherian ('marsupial') species to be sequenced, the opossum provides a unique perspective on the organization and evolution of mammalian genomes. Distinctive features of the opossum chromosomes provide support for recent theories about genome evolution and function, including a strong influence of biased gene conversion on nucleotide sequence composition, and a relationship between chromosomal characteristics and X chromosome inactivation. Comparison of opossum and eutherian genomes also reveals a sharp difference in evolutionary innovation between protein-coding and non-coding functional elements. True innovation in protein-coding genes seems to be relatively rare, with lineage-specific differences being largely due to diversification and rapid turnover in gene families involved in environmental interactions. In contrast, about 20% of eutherian conserved non-coding elements (CNEs) are recent inventions that postdate the divergence of Eutheria and Metatheria. A substantial proportion of these eutherian-specific CNEs arose from sequence inserted by transposable elements, pointing to transposons as a major creative force in the evolution of mammalian gene regulation. ©2007 Nature Publishing Group