327 research outputs found

    trieFinder: an efficient program for annotating Digital Gene Expression (DGE) tags

    BACKGROUND: Quantification of a transcriptional profile is a useful way to evaluate the activity of a cell at a given point in time. Although RNA-Seq has revolutionized transcriptional profiling, the costs of RNA-Seq are still significantly higher than those of microarrays, and often the depth of data delivered from RNA-Seq is in excess of what is needed for simple transcript quantification. Digital Gene Expression (DGE) is a cost-effective, sequence-based approach for simple transcript quantification: by sequencing one read per molecule of RNA, this technique can be used to efficiently count transcripts while obviating the need for transcript-length normalization and reducing the total number of reads necessary for accurate quantification. Here, we present trieFinder, a program specifically designed to rapidly map, parse, and annotate DGE tags of various lengths against cDNA and/or genomic sequence databases. RESULTS: The trieFinder algorithm maps DGE tags in a two-step process. First, it scans FASTA files of RefSeq, UniGene, and genomic DNA sequences to create a database of all tags that can be derived from a predefined restriction site. Next, it compares the experimental DGE tags to this tag database, taking advantage of the fact that the tags are stored as a prefix tree, or “trie”, which allows for linear-time searches for exact matches. DGE tags with mismatches are analyzed by recursive calls in the data structure. We find that, in terms of alignment speed, the mapping functionality of trieFinder compares favorably with Bowtie. CONCLUSIONS: trieFinder can quickly provide the user with an annotation of the DGE tags from three sources simultaneously, simplifying transcript quantification and novel transcript detection. It delivers the data in a simple parsed format, which obviates the need to post-process the alignment results. trieFinder is available at http://research.nhgri.nih.gov/software/trieFinder/
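
    The two-step mapping described in this abstract can be illustrated with a minimal Python sketch. It is not trieFinder's actual code: the restriction site ("CATG"), the tag length, and every function and class name below are assumptions chosen for illustration. The sketch extracts the tag that follows each restriction site, stores the tags in a prefix tree, and then looks up query tags either exactly or with a bounded number of mismatches through recursive descent.

```python
from typing import Dict, List, Set


class TrieNode:
    """One node of the prefix tree; annotations are set only where a tag ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.annotations: Set[str] = set()


def extract_tags(seq: str, site: str = "CATG", tag_len: int = 17) -> List[str]:
    """Return each full-length tag that directly follows an occurrence of `site`.

    The site and tag length are illustrative assumptions, not trieFinder defaults.
    """
    tags = []
    pos = seq.find(site)
    while pos != -1:
        start = pos + len(site)
        tag = seq[start:start + tag_len]
        if len(tag) == tag_len:  # skip tags truncated by the sequence end
            tags.append(tag)
        pos = seq.find(site, pos + 1)
    return tags


def insert(root: TrieNode, tag: str, annotation: str) -> None:
    """Add a tag to the trie and record which sequence it came from."""
    node = root
    for base in tag:
        node = node.children.setdefault(base, TrieNode())
    node.annotations.add(annotation)


def search(node: TrieNode, tag: str, max_mismatches: int = 0, depth: int = 0) -> Set[str]:
    """Return annotations of stored tags within `max_mismatches` substitutions.

    With max_mismatches=0 only one branch is followed per base, so an exact
    lookup takes time linear in the tag length.
    """
    if depth == len(tag):
        return set(node.annotations)
    hits: Set[str] = set()
    for base, child in node.children.items():
        cost = 0 if base == tag[depth] else 1
        if cost <= max_mismatches:
            hits |= search(child, tag, max_mismatches - cost, depth + 1)
    return hits


# Toy reference sequences (hypothetical), using 4-base tags for readability.
root = TrieNode()
for name, seq in [("geneA", "GGCATGACGTACGTTT"), ("geneB", "TTCATGACGAAA")]:
    for tag in extract_tags(seq, tag_len=4):
        insert(root, tag, name)

print(sorted(search(root, "ACGT")))                    # ['geneA']
print(sorted(search(root, "ACGT", max_mismatches=1)))  # ['geneA', 'geneB']
```

    In this sketch the mismatch budget prunes the recursion: a branch is abandoned as soon as its substitutions exceed the budget, so only stored tags close to the query are visited.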

    The suboptimal structures find the optimal RNAs: homology search for bacterial non-coding RNAs using suboptimal RNA structures

    Non-coding RNAs (ncRNAs) are regulatory molecules encoded in the intergenic or intragenic regions of the genome. In prokaryotes, biocomputational identification of homologs of known ncRNAs in other species often fails due to weakly evolutionarily conserved sequences, structures, synteny and genome localization, except in the case of evolutionarily closely related species. To eliminate results from weak conservation, we focused on RNA structure, which is the most conserved ncRNA property. Analysis of the structure of one of the few well-studied bacterial ncRNAs, 6S RNA, demonstrated that, unlike optimal and consensus structures, suboptimal structures are capable of capturing RNA homology even in divergent bacterial species. A computational procedure for the identification of homologous ncRNAs using suboptimal structures was created. The suggested procedure was applied to strongly divergent bacterial species and was capable of identifying homologous ncRNAs.

    A new scheme to calculate isotope effects

    We present a new scheme to calculate isotope effects. Only selected frequencies at the target level of theory are calculated. The frequencies are selected by an analysis of the Hessian from a lower level of theory. We obtain accurate isotope effects without calculating the full Hessian at the target level of theory. The calculated frequencies are very accurate. The scheme converges to the correct isotope effect.

    Mitome: dynamic and interactive database for comparative mitochondrial genomics in metazoan animals

    Mitome is a specialized mitochondrial genome database designed for easy comparative analysis of various features of metazoan mitochondrial genomes such as base frequency, A+T skew, codon usage and gene arrangement pattern. A particular function of the database is the automatic reconstruction of phylogenetic relationships among metazoans selected by a user from a taxonomic tree menu based on nucleotide sequences, amino acid sequences or gene arrangement patterns. Mitome also enables us (i) to easily find the taxonomic positions of organisms of which complete mitochondrial genome sequences are publicly available; (ii) to acquire various metazoan mitochondrial genome characteristics through a graphical genome browser; (iii) to search for homology patterns in mitochondrial gene arrangements; (iv) to download nucleotide or amino acid sequences not only of an entire mitochondrial genome but also of each component; and (v) to find interesting references easily through links with PubMed. In order to provide users with a dynamic, responsive, interactive and faster web database, Mitome is constructed using two recently highlighted techniques, Ajax (Asynchronous JavaScript and XML) and Web Services. Mitome has the potential to become very useful in the fields of molecular phylogenetics and evolution and comparative organelle genomics. The database is available at: http://www.mitome.info

    Identification of Neural Crest and Glial Enhancers at the Mouse Sox10 Locus through Transgenesis in Zebrafish

    Sox10 is a dynamically regulated transcription factor gene that is essential for the development of neural crest–derived and oligodendroglial populations. Developmental genes often require multiple regulatory sequences that integrate discrete and overlapping functions to coordinate their expression. To identify Sox10 cis-regulatory elements, we integrated multiple model systems, including cell-based screens and transposon-mediated transgenesis in zebrafish, to scrutinize mammalian conserved, noncoding genomic segments at the mouse Sox10 locus. We demonstrate that eight of 11 Sox10 genomic elements direct reporter gene expression in transgenic zebrafish similar to patterns observed in transgenic mice, despite an absence of observable sequence conservation between mice and zebrafish. Multiple segments direct expression in overlapping populations of neural crest derivatives and glial cells, ranging from pan-Sox10 and pan-neural crest regulatory control to the modulation of expression in subpopulations of Sox10-expressing cells, including developing melanocytes and Schwann cells. Several sequences demonstrate overlapping spatial control, yet direct expression in incompletely overlapping developmental intervals. We were able to partially explain neural crest expression patterns by the presence of head-to-head SoxE family binding sites within two of the elements. Moreover, we were able to use this transcription factor binding site signature to identify the corresponding zebrafish enhancers in the absence of overall sequence homology. We demonstrate the utility of zebrafish transgenesis as a high-fidelity surrogate in the dissection of mammalian gene regulation, especially for genes with dynamically controlled developmental expression.