63 research outputs found

    RNAstrand: reading direction of structured RNAs in multiple sequence alignments

    Motivation: Genome-wide screens for structured ncRNA genes in mammals, urochordates, and nematodes have predicted thousands of putative ncRNA genes and other structured RNA motifs. A prerequisite for their functional annotation is to determine the reading direction with high precision. Results: While folding energies of an RNA and its reverse complement are similar, the differences are sufficient at least in conjunction with substitution patterns to discriminate between structured RNAs and their complements. We present here a support vector machine that reliably classifies the reading direction of a structured RNA from a multiple sequence alignment and provides a considerable improvement in classification accuracy over previous approaches. Software: RNAstrand is freely available as a stand-alone tool from http://www.bioinf.uni-leipzig.de/Software/RNAstrand and is also included in the latest release of RNAz, a part of the Vienna RNA Package.

    Dinucleotide controlled null models for comparative RNA gene prediction

    Background: Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is a need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. Results: We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance-based approach, a tree is estimated under this model, which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm, giving a new variant of a thermodynamic structure-based RNA gene finding program that is not biased by the dinucleotide content. Conclusion: SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as a standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. Availability: SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.

    An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast

    Background: Aligning RNA sequences with low sequence identity has been a challenging problem, since such a computation essentially requires an algorithm of high complexity to take structural conservation into account. Although many sophisticated algorithms for this purpose have been proposed to date, further improvement in efficiency is necessary to accelerate large-scale applications, including non-coding RNA (ncRNA) discovery. Results: We developed a new genetic algorithm, Cofolga2, for simultaneously computing pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that our new algorithm is accurate and efficient in both time and memory usage. Then, combining it with the originally trained SVM, we applied the new algorithm to novel ncRNA discovery, where we compared the S. cerevisiae genome with six related genomes in a pairwise manner. By focusing our search on the relatively short regions (50 bp to 2,000 bp) sandwiched by conserved sequences, we successfully predicted 714 intergenic and 1,311 sense or antisense ncRNA candidates, which were found in the pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%). By comparing with the previous predictions, we found that more than 92% of the candidates are novel. The estimated rate of false positives in the predicted candidates is 51%. Twenty-five percent of the intergenic candidates have support for expression in the cell, i.e., their genomic positions overlap those of experimentally determined transcripts in the literature. Moreover, by manual inspection of the results, we obtained four multiple alignments with low sequence identity which reveal consensus structures shared by three species/sequences. Conclusion: The present method gives an efficient tool complementary to sequence-alignment-based ncRNA finders.

    Conserved Secondary Structures in Aspergillus

    Background: Recent evidence suggests that the number and variety of functional RNAs (ncRNAs as well as cis-acting RNA elements within mRNAs) is much higher than previously thought; thus, the ability to computationally predict and analyze RNAs has taken on new importance. We have computationally studied the secondary structures in an alignment of six Aspergillus genomes. Little is known about the RNAs present in this set of fungi, and this diverse set of genomes has an optimal level of sequence conservation for observing the correlated evolution of base-pairs seen in RNAs. Methodology/Principal Findings: We report the results of a whole-genome search for evolutionarily conserved secondary structures, as well as the results of clustering these predicted secondary structures by structural similarity. We find a total of 7450 predicted secondary structures, including a new predicted ~60 bp long hairpin motif found primarily inside introns. We find no evidence for microRNAs. Different types of genomic regions are over-represented in different classes of predicted secondary structures. Exons contain the longest motifs (primarily long, branched hairpins), 5′ UTRs primarily contain groupings of short hairpins located near the start codon, and 3′ UTRs contain very little secondary structure compared to other regions. There is a large concentration of short hairpins just inside the boundaries of exons. The density of predicted intronic RNAs increases with the length of introns, and the density of predicted secondary structures within mRNA coding regions increases with the number of introns in a gene.

    Publisher Correction: Demonstration of reduced neoclassical energy transport in Wendelstein 7-X


    Demonstration of reduced neoclassical energy transport in Wendelstein 7-X


    Experimental confirmation of efficient island divertor operation and successful neoclassical transport optimization in Wendelstein 7-X


    Forward modeling of collective Thomson scattering for Wendelstein 7-X plasmas: Electrostatic approximation

    In this paper, we present a method for numerical computation of collective Thomson scattering (CTS). We developed a forward model, eCTS, in the electrostatic approximation and benchmarked it against a full electromagnetic model. Differences between the electrostatic and the electromagnetic models are discussed. The sensitivity of the results to the ion temperature and the plasma composition is demonstrated. We integrated the model into the Bayesian data analysis framework Minerva and used it for the analysis of noisy synthetic data sets produced by a full electromagnetic model. It is shown that eCTS can be used for the inference of the bulk ion temperature. The model has been used to infer the bulk ion temperature from the first CTS measurements on Wendelstein 7-X.

    Towards a new image processing system at Wendelstein 7-X: From spatial calibration to characterization of thermal events

    Wendelstein 7-X (W7-X) is the most advanced fusion experiment in the stellarator line and is aimed at proving that the stellarator concept is suitable for a fusion reactor. One of the most important issues for fusion reactors is the monitoring of plasma-facing components when exposed to very high heat loads, through the use of visible and infrared (IR) cameras. In this paper, a new image processing system for the analysis of the strike lines on the inboard limiters from the first W7-X experimental campaign is presented. This system builds a model of the IR cameras through the use of spatial calibration techniques, helping to characterize the strike lines by using the information given by real spatial coordinates of each pixel. The characterization of the strike lines is made in terms of position, size, and shape, after projecting the camera image in a 2D grid which tries to preserve the curvilinear surface distances between points. The description of the strike-line shape is made by means of the Fourier Descriptors.

    Experimental confirmation of efficient island divertor operation and successful neoclassical transport optimization in Wendelstein 7-X

    We present recent highlights from the most recent operation phases of Wendelstein 7-X, the most advanced stellarator in the world. Stable detachment with good particle exhaust, low impurity content, and energy confinement times exceeding 100 ms have been maintained for tens of seconds. Pellet fueling allows for plasma phases with reduced ion-temperature-gradient turbulence, and during such phases, the overall confinement is so good (energy confinement times often exceeding 200 ms) that the attained density and temperature profiles would not have been possible in less optimized devices, since they would have had neoclassical transport losses exceeding the heating applied in W7-X. This provides proof that the reduction of neoclassical transport through magnetic field optimization is successful. W7-X plasmas generally show good impurity screening and high plasma purity, but there is evidence of longer impurity confinement times during turbulence-suppressed phases. EC/H2020/633053/EU/Implementation of activities described in the Roadmap to Fusion during Horizon 2020 through a Joint programme of the members of the EUROfusion consortium/ EUROfusio