48 research outputs found

    Structural analysis of aligned RNAs

    Knowledge about classes of non-coding RNAs (ncRNAs) is growing rapidly, and structure is the main characteristic shared by members of the same class. To characterize such classes correctly, it is therefore important to analyse their structural features in detail. In this manuscript I present RNAlishapes, which combines various secondary structure analysis methods, such as suboptimal folding and shape abstraction, with a comparative approach known as RNA alignment folding. RNAlishapes makes use of an extended thermodynamic model and covariance scoring, which rewards covariation of paired bases. Applied with shape abstraction to a set of bacterial trp-operon leaders, the algorithm identified the two alternating conformations of this attenuator. Besides providing in-depth analysis methods for aligned RNAs, the tool also achieves fairly good prediction accuracy. RNAlishapes therefore provides the community with a powerful tool for structural analysis of classes of RNAs and is also a reasonable method for consensus structure prediction based on sequence alignments. RNAlishapes is available for online use and download at
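
To make "rewarding covariation of paired bases" concrete, here is a minimal sketch of one common way to score two alignment columns: the average pairwise Hamming distance between canonical base pairs, as used in alignment-folding approaches. Whether RNAlishapes uses exactly this form is an assumption; the function name `covariation_score` and the example columns are hypothetical.

```python
from itertools import combinations

# Watson-Crick and wobble pairs that count as compatible base pairs.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def covariation_score(col_i, col_j):
    """Average pairwise Hamming distance between canonical base pairs.

    col_i and col_j are two alignment columns given as strings, one
    character per sequence. Only comparisons in which both sequences
    form a canonical pair contribute; all others contribute zero.
    This is an illustrative score, not the RNAlishapes implementation.
    """
    pairs = [a + b for a, b in zip(col_i.upper(), col_j.upper())]
    if len(pairs) < 2:
        return 0.0
    total = 0
    for p, q in combinations(pairs, 2):
        if p in CANONICAL and q in CANONICAL:
            # 0 if identical, 1 for a one-sided change, 2 for a compensatory change.
            total += (p[0] != q[0]) + (p[1] != q[1])
    n_comparisons = len(pairs) * (len(pairs) - 1) // 2
    return total / n_comparisons


conserved = covariation_score("GGGG", "CCCC")  # same pair in every sequence
covarying = covariation_score("GGCA", "CCGU")  # GC, GC, CG, AU
print(conserved, covarying)
```

A fully conserved pair scores zero here, while compensatory mutations that preserve pairing raise the score, which is the behaviour a covariance bonus is meant to capture.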

    Advanced tools for RNA secondary structure analysis

    Voß B. Advanced tools for RNA secondary structure analysis. Bielefeld (Germany): Bielefeld University; 2004.
    The analysis of RNA secondary structure has become increasingly important over the last decades, after it was recognised that RNA serves not only as a passive messenger (mRNA) but also as a functional component of the cell. Furthermore, it was elucidated that mainly the structure, rather than the sequence, determines the function of such non-protein-coding RNA. This means that two RNA molecules with low sequence similarity but high structural similarity are likely to have a similar function. The prediction of RNA secondary structure is based on parameters that have been measured in vitro. This results in rather static parameters that do not incorporate the dynamic change of environment occurring in living organisms. Nevertheless, the use of these parameters, which are summarised in the energy model, has given valuable results, especially for short sequences. Several refinements over the years have improved the predictions, but the calculated optimal structure is still not guaranteed to correspond to the native one. In this case, and because the native structure is feasible under the energy model, it is common practice to additionally calculate suboptimal structures and incorporate these in the study. The set of all suboptimal structures is referred to as the structure space, which holds the information needed to answer questions such as: Is the optimal structure also the native one? Is there more than one structure an RNA molecule can adopt? How well-defined is the optimal structure? Major problems in the analysis of the structure space are its size and its shape. The number of suboptimal structures is exponential in the sequence length, which means that for sequences of moderate length the size quickly exceeds several billion. Besides the size, the appearance of the structure space complicates its study.
    The structure space can be imagined as a rough landscape with valleys, holding locally optimal structures, separated by mountains and saddles. This landscape is not smooth but rugged and complex, which prevents the development of a practical and still intuitive visualisation. In general, the aim of structure space analysis is not visualisation, but the complexity of the space also hampers approaches to derive specific features hidden in it. Despite these problems, several tools exist that analyse the complete structure space, or at least a part of it, to answer the aforementioned questions. Among these are MFOLD, which produces a subset of all possible structures according to a threshold of structural similarity; SFOLD, which samples the structures in a probabilistic fashion and provides a method to identify alternating structures; RNAsubopt, which produces all suboptimal structures within a given energy threshold; barriers, which identifies valleys, mountains and saddles of the structure landscape; and others.
    My contribution to this area of research is twofold. First, I present paRNAss (prediction of alternating RNA secondary structures), which focuses on the detection of conformational switches and analyses the structure space based on pairwise comparisons. paRNAss has been available since 1997, and I improved its predictive power as well as its speed, which made a systematic evaluation possible. During this evaluation it turned out that paRNAss can even be used to identify more than two competing structures and hence to gain a deeper insight into the structure space. The second tool I introduce is RNAshapes, which facilitates different kinds of analyses. The algorithm makes use of abstract representations of the secondary structure to compute only those structures that are morphologically dissimilar, i.e. are composed of different structural elements. Morphologically similar structures are pooled in a class of structures, and each class is represented by its best member. The list of these representatives gives a general overview of what the structure space contains. In addition, I introduce an algorithm to compute probabilities of the aforementioned classes of structures. This gives hints to properties such as alternating secondary structures (two classes with similar probabilities) and structural well-definedness (one class with very high probability).
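
The idea of pooling morphologically similar structures and keeping the best member of each class can be sketched as follows. This is a minimal illustration, assuming dot-bracket input and an abstraction that keeps only the nesting pattern of helices (unpaired bases, bulges and internal loops are ignored), in the spirit of the most abstract shape level. The function names and the example energies are hypothetical and are not taken from RNAshapes.

```python
def pair_table(db):
    """Map each paired position to its partner; unpaired positions get -1."""
    table = [-1] * len(db)
    stack = []
    for k, ch in enumerate(db):
        if ch == "(":
            stack.append(k)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            table[j], table[k] = k, j
    if stack:
        raise ValueError("unbalanced structure")
    return table


def top_level_pairs(table, i, j):
    """Return the outermost base pairs (p, q) found in positions i..j."""
    found = []
    k = i
    while k <= j:
        if table[k] > k:
            found.append((k, table[k]))
            k = table[k] + 1  # skip everything enclosed by this pair
        else:
            k += 1
    return found


def helix_shape(table, p, q):
    """Abstract the substructure closed by pair (p, q) to nested brackets."""
    kids = top_level_pairs(table, p + 1, q - 1)
    # A single enclosed pair means stacking, a bulge or an internal loop:
    # treat it as a continuation of the same helix.
    while len(kids) == 1:
        p, q = kids[0]
        kids = top_level_pairs(table, p + 1, q - 1)
    return "[" + "".join(helix_shape(table, a, b) for a, b in kids) + "]"


def abstract_shape(db):
    """Abstract a dot-bracket string; the unfolded chain becomes '_'."""
    table = pair_table(db)
    outer = top_level_pairs(table, 0, len(db) - 1)
    return "".join(helix_shape(table, a, b) for a, b in outer) or "_"


def representatives(structures):
    """Keep the lowest-energy (structure, energy) pair of every shape class."""
    best = {}
    for db, energy in structures:
        shape = abstract_shape(db)
        if shape not in best or energy < best[shape][1]:
            best[shape] = (db, energy)
    return best


# Illustrative structures with made-up energies in kcal/mol.
candidates = [
    ("((((...))))....", -6.2),
    ("(((.....)))....", -5.0),
    ("((...))((...)).", -4.1),
]
for shape, (db, energy) in representatives(candidates).items():
    print(shape, db, energy)
```

The two hairpin variants collapse into one class, so the list of representatives stays short even when the underlying set of suboptimal structures is large.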

    Complete probabilistic analysis of RNA shapes

    BACKGROUND: Soon after the first algorithms for RNA folding became available, it was recognised that the prediction of only one energetically optimal structure is insufficient to achieve reliable results. An in-depth analysis of the folding space as a whole appeared necessary to deduce the structural properties of a given RNA molecule reliably. Folding space analysis comprises various methods such as suboptimal folding, computation of base pair probabilities, sampling procedures and abstract shape analysis. Common to many approaches is the idea of partitioning the folding space into classes of structures, for which certain properties can be derived. RESULTS: In this paper we extend the approach of abstract shape analysis. We show how to compute the accumulated probabilities of all structures that share the same shape. While this implies a complete (non-heuristic) analysis of the folding space, the computational effort depends only on the size of the shape space, which is much smaller. This approach has been integrated into the tool RNAshapes, and we apply it to various RNAs. CONCLUSION: Analyses of conformational switches show the existence of two shapes with probabilities of approximately [Formula: see text] vs. [Formula: see text], whereas the analysis of a microRNA precursor reveals one shape with a probability close to 1.0. Furthermore, it is shown that a shape can outperform an energetically more favourable one by achieving a higher probability. From these results, and from the fact that we use a complete and exact analysis of the folding space, we conclude that this approach opens up new and promising routes for investigating and understanding RNA secondary structure.
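
The accumulated probability of a shape is the sum of the Boltzmann weights of all structures with that shape, divided by the partition function. The sketch below illustrates this definition by brute force over an explicitly enumerated list. It is not the algorithm of the paper, which avoids enumerating structures. The function name, the shape strings and the energies are hypothetical illustration values.

```python
import math
from collections import defaultdict

# Gas constant in kcal/(mol*K) times 310.15 K (37 degrees Celsius).
RT = 0.0019872 * 310.15


def shape_probabilities(structures, rt=RT):
    """Sum Boltzmann weights per shape and normalise by the partition function.

    structures is an iterable of (shape, energy) with energies in kcal/mol.
    Energies are shifted by the minimum before exponentiation, which leaves
    the probabilities unchanged and avoids overflow.
    """
    structures = list(structures)
    if not structures:
        return {}
    e_min = min(energy for _, energy in structures)
    weights = defaultdict(float)
    for shape, energy in structures:
        weights[shape] += math.exp(-(energy - e_min) / rt)
    z = sum(weights.values())
    return {shape: w / z for shape, w in weights.items()}


# One shape holds the single best structure; the other holds three
# slightly less favourable structures (made-up energies).
example = [
    ("[]", -10.0),
    ("[][]", -9.8),
    ("[][]", -9.7),
    ("[][]", -9.6),
]
for shape, prob in shape_probabilities(example).items():
    print(shape, round(prob, 3))
```

In this toy input the shape `[][]` ends up more probable than `[]` even though `[]` contains the minimum free energy structure, which mirrors the observation that a shape can outperform an energetically more favourable one.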

    A motif-based search in bacterial genomes identifies the ortholog of the small RNA Yfr1 in all lineages of cyanobacteria

    BACKGROUND: Non-coding RNAs (ncRNAs) are regulators of gene expression in all domains of life. They control growth and differentiation, virulence, motility and various stress responses. The identification of ncRNAs can be a tedious process due to the heterogeneous nature of this molecule class and the lack of sequence similarity among orthologs, even in closely related species. The small ncRNA Yfr1 has previously been found in the Prochlorococcus/Synechococcus group of marine cyanobacteria. RESULTS: Here we show that screening available genome sequences based on an RNA motif, followed by experimental analysis, successfully detects this RNA in all lineages of cyanobacteria. Yfr1 is an abundant ncRNA between 54 and 69 nt in size that is ubiquitous in cyanobacteria except for two low light-adapted strains of Prochlorococcus, MIT 9211 and SS120, in which it must have been lost secondarily. Yfr1 consists of two predicted stem-loop elements separated by an unpaired sequence of 16–20 nucleotides containing the ultraconserved undecanucleotide 5'-ACUCCUCACAC-3'. CONCLUSION: Starting with an ncRNA previously found only in a narrow group of cyanobacteria, we show here the highly specific and sensitive identification of its homologs within all lineages of cyanobacteria, whereas it was not detected within the genome sequences of E. coli and of 7 other eubacteria belonging to the alpha-proteobacteria, Chlorobiaceae and spirochaetes. The integration of RNA motif prediction into computational pipelines for the detection of ncRNAs in bacteria appears to be a promising step towards improving the quality of such predictions.

    Biocomputational prediction of non-coding RNAs in model cyanobacteria

    BACKGROUND: In bacteria, non-coding RNAs (ncRNAs) are crucial regulators of gene expression, controlling various stress responses, virulence, and motility. Previous work revealed a relatively high number of ncRNAs in some marine cyanobacteria. However, for efficient genetic and biochemical analysis it would be desirable to identify a set of ncRNA candidate genes in model cyanobacteria that are easy to manipulate and for which extended mutant, transcriptomic and proteomic data sets are available. RESULTS: Here we have used comparative genome analysis for the biocomputational prediction of ncRNA genes and other sequence/structure-conserved elements in intergenic regions of the three unicellular model cyanobacteria Synechocystis PCC6803, Synechococcus elongatus PCC6301 and Thermosynechococcus elongatus BP1, plus the toxic Microcystis aeruginosa NIES843. The unfiltered numbers of predicted elements in these strains are 383, 168, 168, and 809, respectively, combined into 443 sequence clusters, whereas the numbers of individual elements with high support are 94, 56, 64, and 406, respectively. After also removing transposon-associated repeats, 78, 53, 42 and 168 sequences, respectively, remain, belonging to 109 different clusters in the data set. Experimental analysis of selected ncRNA candidates in Synechocystis PCC6803 validated new ncRNAs originating from the fabF-hoxH and apcC-prmA intergenic spacers and three highly expressed ncRNAs belonging to the Yfr2 family of ncRNAs. Yfr2a promoter-luxAB fusions confirmed a very strong activity of this promoter and indicated a stimulation of expression when the cultures were exposed to elevated light intensities. CONCLUSION: Comparison to entries in Rfam and experimental testing of selected ncRNA candidates in Synechocystis PCC6803 indicate a high reliability of the current prediction, despite some contamination by the high number of repetitive sequences in some of these species. In particular, we identified altogether 8 new ncRNA homologs belonging to the Yfr2 family of ncRNAs in the four species. Modelling of RNA secondary structures indicated two conserved single-stranded sequence motifs that might be involved in RNA-protein interactions or in the recognition of target RNAs. Since our analysis has been restricted to finding ncRNA candidates with a reasonably high degree of conservation among these four cyanobacteria, there might be many more, requiring direct experimental approaches for their identification.

    Evidence for the rapid expansion of microRNA-mediated regulation in early land plant evolution

    BACKGROUND: MicroRNAs (miRNAs) are regulatory RNA molecules that are specified by their mode of action, the structure of primary transcripts, and their typical size of 20–24 nucleotides. Frequently, not only single miRNAs but whole families of closely related miRNAs have been found in animals and plants. Some families are widely conserved among different plant taxa. Hence, it is evident that these conserved miRNAs are of ancient origin and indicate essential functions that have been preserved over long evolutionary time scales. In contrast, other miRNAs seem to be species-specific and consequently must possess very distinct functions. Thus, the analysis of an early-branching species provides a window into the early evolution of fundamental regulatory processes in plants. RESULTS: Based on a combined experimental-computational approach, we report on the identification of 48 novel miRNAs and their putative targets in the moss Physcomitrella patens. Of these, 18 miRNAs and two targets were verified in independent experiments. As a result of our study, the number of known miRNAs in Physcomitrella has been raised to 78. Functional assignments to mRNAs targeted by these miRNAs revealed a bias towards genes that are involved in regulation, cell wall biosynthesis and defense. Eight miRNAs were detected with different expression in protonema and gametophore tissue. The miRNAs 1–50 and 2–51 are located on a shared precursor, are separated by only one nucleotide, and are processed in a tissue-specific way. CONCLUSION: Our data provide evidence for a surprisingly diverse and complex miRNA population in Physcomitrella. Thus, the number and function of miRNAs must have significantly expanded during the evolution of early land plants. The coupled maturation of two miRNAs from a shared precursor, as described here, has not been previously identified in plants.

    Evidence for a major role of antisense RNAs in cyanobacterial gene regulation

    Information on the numbers and functions of naturally occurring antisense RNAs (asRNAs) in eubacteria has thus far remained incomplete. Here, we screened the model cyanobacterium Synechocystis sp. PCC 6803 for asRNAs using four different methods. In the final data set, the number of known noncoding RNAs rose from 6 previously identified to 60, and that of asRNAs from 1 to 73 (28 were verified using at least three methods). Among these, there are many asRNAs to housekeeping, regulatory or metabolic genes, as well as to genes encoding electron transport proteins. Transferring cultures to high light, carbon-limited conditions or darkness influenced the expression levels of several asRNAs, suggesting their functional relevance. Examples include the asRNA to rpl1, which accumulates in a light-dependent manner and may be required for processing the L11 r-operon, and the SyR7 noncoding RNA, which is antisense to the murF 5′ UTR, possibly modulating murein biosynthesis. Extrapolated to the whole genome, ∼10% of all genes in Synechocystis are influenced by asRNAs. Thus, chromosomally encoded asRNAs may have an important function in eubacterial regulatory networks.

    Pseudomonas putida KT2440 is naturally endowed to withstand industrial-scale stress conditions

    Pseudomonas putida is recognized as a very promising strain for industrial application due to its high redox capacity and frequently observed tolerance towards organic solvents. In this research, we studied the metabolic and transcriptional response of P. putida KT2440 exposed to large-scale heterogeneous mixing conditions in the form of repeated glucose shortage. Cellular responses were mimicked in an experimental setup comprising a stirred tank reactor and a connected plug flow reactor. We found that a stringent response-like transcriptional regulation programme is frequently induced, which seems to be linked to the intracellular pool of 3-hydroxyalkanoates (3-HA) that are known to serve as precursors for polyhydroxyalkanoates (PHA). To be precise, P. putida is endowed with a survival strategy that likely allows it to access cellular PHA, amino acids and glycogen within a few seconds under glucose starvation to obtain ATP from respiration, thereby replenishing the reduced ATP levels and the adenylate energy charge. Notably, cells only need 0.4% of glucose uptake to build those 3-HA-based energy buffers. Concomitantly, genes that are related to amino acid catabolism and β-oxidation are upregulated during the transient absence of glucose. Furthermore, we provide a detailed list of transcriptional short- and long-term responses that increase the cellular maintenance by about 17% under the industrial-like conditions tested.

    Simulation of Folding Kinetics for Aligned RNAs

    Studying the folding kinetics of an RNA can provide insight into its function and is thus a valuable method for RNA analysis. Computational approaches to the simulation of folding kinetics suffer from the exponentially large folding space that needs to be evaluated. Here, we present a new approach that combines structure abstraction with evolutionary conservation to restrict the analysis to common parts of folding spaces of related RNAs. The resulting algorithm can recapitulate the folding kinetics known for single RNAs and is able to analyse even long RNAs in reasonable time. Our program RNAliHiKinetics is the first algorithm for the simulation of consensus folding kinetics and addresses a long-standing problem in a new and unique way.