58 research outputs found

    BRUCE: a program for the detection of transfer-messenger RNA genes in nucleotide sequences

    A computer program, BRUCE, was developed for the identification of transfer-messenger RNA (tmRNA) genes. The program employs heuristic algorithms to search for a tRNA(Ala)-like secondary structure surrounding a short sequence encoding the tag peptide. In the 57 completely sequenced bacterial genomes where tmRNA genes have been reported previously, BRUCE identified all with no false positives. In addition, BRUCE found 99 of the 100 tmRNAs identified previously in other bacteria, red chloroplasts and cyanelles. The output of the program reports the proposed tRNA secondary structure, the tmRNA gene sequence and the tag peptide.

    The tmRDB and SRPDB resources

    Maintained at the University of Texas Health Science Center at Tyler, Texas, the tmRNA database (tmRDB) is accessible at the URL with mirror sites located at Auburn University, Auburn, Alabama () and the Royal Veterinary and Agricultural University, Denmark (). The signal recognition particle database (SRPDB) at is mirrored at and the University of Goteborg (). The databases assist in investigations of the tmRNP (a ribonucleoprotein complex which liberates stalled bacterial ribosomes) and the SRP (a particle which recognizes signal sequences and directs secretory proteins to cell membranes). The curated tmRNA and SRP RNA alignments consider base pairs supported by comparative sequence analysis. Also shown are alignments of the tmRNA-associated proteins SmpB, ribosomal protein S1, alanyl-tRNA synthetase and Elongation Factor Tu, as well as the SRP proteins SRP9, SRP14, SRP19, SRP21, SRP54 (Ffh), SRP68, SRP72, cpSRP43, Flhf, SRP receptor (alpha) and SRP receptor (beta). All alignments can be easily examined using a new exploratory browser. The databases provide links to high-resolution structures and serve as depositories for structures obtained by molecular modeling

    Comparative 3-D Modeling of tmRNA

    BACKGROUND: Trans-translation releases stalled ribosomes from truncated mRNAs and tags defective proteins for proteolytic degradation using transfer-messenger RNA (tmRNA). This small stable RNA represents a hybrid of tRNA- and mRNA-like domains connected by a variable number of pseudoknots. Comparative sequence analysis of tmRNAs found in bacteria, plastids, and mitochondria provides considerable insights into their secondary structures. Progress toward understanding the molecular mechanism of template switching, which constitutes an essential step in trans-translation, is hampered by our limited knowledge about the three-dimensional folding of tmRNA. RESULTS: To facilitate experimental testing of the molecular intricacies of trans-translation, which often require appropriately modified tmRNA derivatives, we developed a procedure for building three-dimensional models of tmRNA. Using comparative sequence analysis, phylogenetically supported 2-D structures were obtained to serve as input for the program ERNA-3D. Motifs containing loops and turns were extracted from the known structures of other RNAs and used to improve the tmRNA models. Biologically feasible 3-D models for the entire tmRNA molecule could be obtained. The models were characterized by a functionally significant close proximity between the tRNA-like domain and the resume codon. Potential conformational changes which might lead to a more open structure of tmRNA upon binding to the ribosome are discussed. The method, described in detail for the tmRNAs of Escherichia coli, Bacillus anthracis, and Caulobacter crescentus, is applicable to every tmRNA. CONCLUSION: Improved molecular models of biological significance were obtained. These models will guide the design of experiments and provide a better understanding of trans-translation. The comparative procedure described here for tmRNA is easily adapted to modeling the members of other RNA families.

    Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

    Background: The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved. Results: This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences. Conclusion: SSCA is an algorithm for selecting combinations of RNA homologous sequences suitable for secondary structure predictions with the comparative approach.
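
    The idea of covarying residues in the abstract above can be illustrated with a minimal Python sketch. It is not the SSCA algorithm: it only scans a gap-free toy alignment for column pairs that form a Watson-Crick or wobble pair in every sequence while both columns vary. The function name `covarying_columns` and the example alignment are hypothetical, and real comparative methods use statistical scores rather than this all-or-nothing rule.

```python
from itertools import combinations

# Pairings treated as valid here: Watson-Crick plus the G-U wobble pair.
# Non-canonical pairings are deliberately ignored in this simplified sketch.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def covarying_columns(alignment):
    """Return column pairs (i, j) that pair in every sequence and both vary.

    `alignment` is a list of equal-length, gap-free RNA strings (an
    assumption of this sketch, not a requirement of the comparative approach).
    """
    length = len(alignment[0])
    hits = []
    for i, j in combinations(range(length), 2):
        observed = {(seq[i], seq[j]) for seq in alignment}
        left = {a for a, _ in observed}
        right = {b for _, b in observed}
        # Every sequence must pair, and both columns must show variation,
        # which is the signature of a compensatory change.
        if observed <= CANONICAL and len(left) >= 2 and len(right) >= 2:
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    toy = ["GAAAC", "CAAAG", "AAAAU"]  # hypothetical homologous sequences
    print(covarying_columns(toy))
```

    On the toy alignment only the first and last columns qualify. Identical sequences would yield no hits, which mirrors the remark that sequences must differ enough for compensatory mutations to be revealed.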

    Utilization of tmRNA sequences for bacterial identification

    In recent years, molecular approaches based on nucleotide sequences of ribosomal RNA (rRNA) have become widely used tools for identification of bacteria [1-4]. The high degree of evolutionary conservation makes 16S and 23S rRNA molecules very suitable for phylogenetic studies above the species level [3-5]. More than 16,000 sequences of 16S rRNA are presently available in public databases [4,6]. The 16S rRNA sequences are commonly used to design fluorescently labeled oligonucleotide probes. Fluorescence in situ hybridization (FISH) with these probes followed by observation with epifluorescence microscopy allows the identification of a specific microorganism in a mixture with other bacteria [2-4]. By shifting probe target sites from conservative to increasingly variable regions of rRNA, it is possible to adjust the probe specificity from kingdom to species level. Nevertheless, 16S rRNA sequences of closely related strains, subspecies, or even of different species are often identical and therefore can not be used as differentiating markers [3]. Another restriction concerns the accessibility of target sites to the probe in FISH experiments. The presence of secondary structures, or protection of rRNA segments by ribosomal proteins in fixed cells can limit the choice of variable regions as in situ targets for oligonucleotide probes [7,8]. One way to overcome the limitations of in situ identification of bacteria is to use molecules other than rRNA for phylogenetic identification of bacteria, for which nucleotide sequences would be sufficiently divergent to design species specific probes, and which would be more accessible to oligonucleotide probes. For this purpose we investigated the possibility of using tmRNA (also known as 10Sa RNA; [9-11]). This molecule was discovered in E. coli and described as small stable RNA, present at ~1,000 copies per cell [9,11]. 
The high copy number is an important prerequisite for FISH, which works best with naturally amplified target molecules. In E. coli, tmRNA is encoded by the ssrA gene, is 363 nucleotides long and has properties of tRNA and mRNA [12,13]. tmRNA was shown to be involved in the degradation of truncated proteins: the tmRNA associates with ribosomes stalled on mRNAs lacking stop codons, finally resulting in the addition of a C-terminal peptide tag to the truncated protein. The peptide tag directs the abnormal protein to proteolysis [14,15]. So far, 165 tmRNA sequences have been determined (August 2001; The tmRNA Website: http://www.indiana.edu/~tmrna/) [16,17]. The tmRNA is likely to be present in all bacteria and has also been found in algal chloroplasts, the cyanelle of Cyanophora paradoxa and the mitochondrion of the flagellate Reclinomonas americana [10,17,18].

    McGenus: A Monte Carlo algorithm to predict RNA secondary structures with pseudoknots

    We present McGenus, an algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. McGenus can treat sequences of up to 1000 bases and performs an advanced stochastic search of their minimum free energy structure allowing for nontrivial pseudoknot topologies. Specifically, McGenus employs a multiple Markov chain scheme for minimizing a general scoring function which includes not only free energy contributions for pair stacking, loop penalties, etc. but also a phenomenological penalty for the genus of the pairing graph. The good performance of the stochastic search strategy was successfully validated against TT2NE, which uses the same free energy parametrization and performs exhaustive or partially exhaustive structure search, albeit for much shorter sequences (up to 200 bases). Next, the method was applied to other RNA sets, including an extensive tmRNA database, yielding results that are competitive with existing algorithms. Finally, it is shown that McGenus highlights possible limitations in the free energy scoring function. The algorithm is available as a web-server at http://ipht.cea.fr/rna/mcgenus.php.
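
    To make the notion of genus in the abstract above concrete, here is a small Python sketch that computes the genus of a set of base pairs drawn as arcs over a single backbone. It is not McGenus code and makes no claim about how McGenus evaluates its penalty. It assumes the standard fat-graph construction: with n pairs and b boundary components, the genus is (n + 1 - b) / 2. The function name `genus` and the example structures are hypothetical.

```python
def genus(pairs):
    """Genus of the arc diagram formed by base pairs (i, j) on one backbone.

    Assumes each position takes part in at most one pair. Unpaired
    positions do not affect the topology, so they are dropped.
    """
    if not pairs:
        return 0
    # Relabel the paired positions 0..2n-1 in backbone order.
    ends = sorted(pos for pair in pairs for pos in pair)
    index = {pos: k for k, pos in enumerate(ends)}
    partner = {}
    for i, j in pairs:
        partner[index[i]] = index[j]
        partner[index[j]] = index[i]
    m = len(ends)
    # Trace boundary components of the thickened diagram. Label k stands for
    # the backbone segment that follows endpoint k (backbone closed into a
    # circle). Walking along that segment reaches endpoint k + 1, crosses
    # its arc, and continues on the segment after the partner endpoint.
    seen = set()
    boundaries = 0
    for start in range(m):
        if start in seen:
            continue
        boundaries += 1
        k = start
        while k not in seen:
            seen.add(k)
            k = partner[(k + 1) % m]
    return (len(pairs) + 1 - boundaries) // 2


if __name__ == "__main__":
    hairpin = [(0, 3), (1, 2)]                       # nested pairs, no crossing
    h_type = [(1, 10), (2, 9), (5, 15), (6, 14)]     # two crossing stems
    print(genus(hairpin), genus(h_type))
```

    Nested structures have genus 0, while the crossing stems of a simple pseudoknot give genus 1, which is the kind of quantity a genus penalty would act on.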

    RNAcentral: A vision for an international database of RNA sequences

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor

    Rapid real-time PCR detection of Listeria monocytogenes in enriched food samples based on the ssrA gene, a novel diagnostic target

    A real-time PCR assay was designed to detect a 162-bp fragment of the ssrA gene in Listeria monocytogenes. The specificity of the assay for L. monocytogenes was confirmed against a panel of 6 Listeria species and 26 other bacterial species. A detection limit of 1-10 genome equivalents was determined for the assay. Application of the assay in natural and artificially contaminated culture enriched foods, including soft cheese, meat, milk, vegetables and fish, enabled detection of 1-5 CFU L. monocytogenes per 25g/ml of food sample in 30h. The performance of the assay was compared with the Roche Diagnostics 'LightCycler foodproof Listeria monocytogenes Detection Kit'. Both methods detected L. monocytogenes in all artificially contaminated retail samples (n=27) and L. monocytogenes was not detected by either system in 27 natural retail food samples. The method developed in this study has the potential to enable the specific detection of L. monocytogenes in a variety of food types in a time-frame considerably faster than current standard methods. The potential of the ssrA gene as a nucleic acid diagnostic (NAD) target has been demonstrated in L. monocytogenes. We are currently developing NAD tests based on the ssrA gene for a range of common foodborne and clinically relevant bacterial pathogens

    TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences

    Background: The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented. Results: TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations.
Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms. Conclusions: TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.
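
    The step of building a structure from base pairs whose estimated probability exceeds a threshold can be sketched in a few lines of Python. This is an illustration only, not part of TurboFold or the RNAstructure package. The function name `pairs_above_threshold`, the dictionary layout, and the probability values are all hypothetical.

```python
def pairs_above_threshold(pair_probs, threshold=0.5):
    """Keep base pairs whose estimated probability exceeds the threshold.

    `pair_probs` maps (i, j) index pairs to probabilities in [0, 1].
    """
    return sorted(pair for pair, p in pair_probs.items() if p > threshold)


if __name__ == "__main__":
    # Hypothetical probabilities for three candidate pairs.
    probs = {(0, 8): 0.9, (1, 7): 0.6, (2, 6): 0.3}
    print(pairs_above_threshold(probs))
```

    Raising the threshold keeps fewer but more confident pairs. If the probabilities come from a model in which each nucleotide pairs with at most one partner, a threshold of 0.5 or higher cannot select two conflicting pairs for the same nucleotide, because their probabilities would have to sum to more than 1.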

    NONCODE: an integrated knowledge database of non-coding RNAs

    NONCODE is an integrated knowledge database dedicated to non-coding RNAs (ncRNAs), that is to say, RNAs that function without being translated into proteins. All ncRNAs in NONCODE were filtered automatically from literature and GenBank, and were later manually curated. The distinctive features of NONCODE are as follows: (i) The ncRNAs in NONCODE include almost all the types of ncRNAs, except transfer RNAs and ribosomal RNAs. (ii) All ncRNA sequences and their related information (e.g. function, cellular role, cellular location, chromosomal information, etc.) in NONCODE have been confirmed manually by consulting relevant literature: more than 80% of the entries are based on experimental data. (iii) Based on the cellular process and function in which a given ncRNA is involved, we introduced a novel classification system, labeled process function class, to integrate existing classification systems. (iv) In addition, some 1100 ncRNAs have been grouped into nine other classes according to whether they are specific to gender or tissue or associated with tumors and diseases, etc. (v) NONCODE provides a user-friendly interface, a visualization platform and a convenient search option, allowing efficient recovery of sequence, regulatory elements in the flanking sequences, secondary structure, related publications and other information. The first release of NONCODE (v1.0) contains 5339 non-redundant sequences from 861 organisms, including eukaryotes, eubacteria, archaebacteria, viruses and viroids. Access is free for all users through a web interface at http://noncode.bioinfo.org.cn