149 research outputs found

Fast Pairwise Structural RNA Alignments by Pruning of the Dynamical Programming Matrix

Author: David Mathews
Elfar Torarinsson
Jakob H Havgaard
Jan Gorodkin
Publication venue: Public Library of Science
Publication date: 01/01/2007
Field of study

It has become clear that noncoding RNAs (ncRNA) play important roles in cells, and emerging studies indicate that there might be a large number of unknown ncRNAs in mammalian genomes. There exist computational methods that can be used to search for ncRNAs by comparing sequences from different genomes. One main problem with these methods is their computational complexity, and heuristics are therefore employed. Two heuristics are currently very popular: pre-folding and pre-aligning. However, these heuristics are not ideal, as pre-aligning is dependent on sequence similarity that may not be present and pre-folding ignores the comparative information. Here, pruning of the dynamical programming matrix is presented as an alternative novel heuristic constraint. All subalignments that do not exceed a length-dependent minimum score are discarded as the matrix is filled out, thus giving the advantage of providing the constraints dynamically. This has been included in a new implementation of the FOLDALIGN algorithm for pairwise local or global structural alignment of RNA sequences. It is shown that time and memory requirements are dramatically lowered while overall performance is maintained. Furthermore, a new divide and conquer method is introduced to limit the memory requirement during global alignment and backtrack of local alignment. All branch points in the computed RNA structure are found and used to divide the structure into smaller unbranched segments. Each segment is then realigned and backtracked in a normal fashion. Finally, the FOLDALIGN algorithm has also been updated with a better memory implementation and an improved energy model. With these improvements in the algorithm, the FOLDALIGN software package provides the molecular biologist with an efficient and user-friendly tool for searching for new ncRNAs. The software package is available for download at http://foldalign.ku.dk
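The pruning idea in this abstract (discard any subalignment whose score falls below a length-dependent minimum while the matrix is being filled) can be illustrated with a much simpler problem than FOLDALIGN's structural alignment. The sketch below is not the FOLDALIGN algorithm: it applies the same kind of constraint to a plain sequence-only local alignment. The scoring values, the linear threshold, and the function names are all assumptions chosen for illustration.

```python
def length_dependent_min_score(length, slope=0.5):
    """Hypothetical threshold: a subalignment of this length must score at least slope * length."""
    return slope * length


def pruned_local_alignment(a, b, match=2, mismatch=-1, gap=-2,
                           min_score=length_dependent_min_score):
    """Sequence-only local alignment with pruning of the DP matrix.

    Returns (best score, number of matrix cells that survived pruning).
    """
    # cells[(i, j)] = (score, length) of the best surviving subalignment
    # that ends at prefix positions i of `a` and j of `b`.
    cells = {}
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Option 1: start a new subalignment at this pair of positions.
            candidates = [(s, 1)]
            # Options 2-4: extend a surviving neighbour (pruned cells are absent).
            for pi, pj, delta in ((i - 1, j - 1, s), (i - 1, j, gap), (i, j - 1, gap)):
                prev = cells.get((pi, pj))
                if prev is not None:
                    candidates.append((prev[0] + delta, prev[1] + 1))
            score, length = max(candidates)
            # Pruning step: drop the cell if it misses the length-dependent minimum.
            if score < min_score(length):
                continue
            cells[(i, j)] = (score, length)
            best = max(best, score)
    return best, len(cells)


if __name__ == "__main__":
    seq = "GGGAAACCC"
    print(pruned_local_alignment(seq, seq))  # with pruning
    print(pruned_local_alignment(seq, seq, min_score=lambda n: float("-inf")))  # no pruning
```

Because pruned cells are never stored, later cells cannot extend them, which is where the time and memory savings in a scheme of this kind would come from. The trade-off is that an overly strict threshold can discard a subalignment that would have recovered later.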

Crossref

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

WAR: Webserver for aligning structural RNAs

Author: Cornish-Bowden
E. Torarinsson
Hofacker
Hofacker
Ioachimescu
McCaskill
Notredame
Pedersen
S. Lindgreen
Thompson
Thompson
Washietl
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2008

We present an easy-to-use webserver that makes it possible to simultaneously use a number of state of the art methods for performing multiple alignment and secondary structure prediction for noncoding RNA sequences. This makes it possible to use the programs without having to download the code and get the programs to run. The results of all the programs are presented on a webpage and can easily be downloaded for further analysis. Additional measures are calculated for each program to make it easier to judge the individual predictions, and a consensus prediction taking all the programs into account is also calculated. This website is free and open to all users and there is no login requirement. The webserver can be found at: http://genome.ku.dk/resources/war

CiteSeerX

Crossref

PubMed Central

Copenhagen University Research Information System

Statistical evaluation of improvement in RNA secondary structure prediction

Author: Anthony Almudevar
Cui
Dalgaard
David H. Mathews
Do
Doudna
Gerstman
Glantz
Gorodkin
Gosling
Hofacker
Holmes
Jambhekar
Kiryu
Lindgreen
Long
Mathews
Mathews
Mathews
Mathews
Matthews
Nissen
Pace
R Development Core Team
Reuter
Sarkar
Shao
Shapiro
Siegmund
Sprinzl
Steffen
Szymanski
Tafer
Torarinsson
Torarinsson
Uzilov
Wald
Washietl
Will
Winkler
Xu
Xu
Yanofsky
Zadeh
Zhenjiang Xu
Znosko
Publication venue: Oxford University Press
Publication date

With discovery of diverse roles for RNA, its centrality in cellular functions has become increasingly apparent. A number of algorithms have been developed to predict RNA secondary structure. Their performance has been benchmarked by comparing structure predictions to reference secondary structures. Generally, algorithms are compared against each other and one is selected as best without statistical testing to determine whether the improvement is significant. In this work, it is demonstrated that the prediction accuracies of methods correlate with each other over sets of sequences. One possible reason for this correlation is that many algorithms use the same underlying principles. A set of benchmarks published previously for programs that predict a structure common to three or more sequences is statistically analyzed as an example to show that it can be rigorously evaluated using paired two-sample t-tests. Finally, a pipeline of statistical analyses is proposed to guide the choice of data set size and performance assessment for benchmarks of structure prediction. The pipeline is applied using 5S rRNA sequences as an example
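A paired test suits this setting because, as the abstract notes, prediction accuracies of different methods correlate over the same sequences, so the per-sequence differences are the quantity of interest. The following is a minimal sketch of the paired t statistic using only the Python standard library. The function name and the accuracy values in the usage example are hypothetical and are not taken from the paper.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for paired per-sequence scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    # Work on per-sequence differences so shared difficulty cancels out.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical accuracies of two methods on the same four sequences.
    method_a = [0.70, 0.65, 0.80, 0.75]
    method_b = [0.60, 0.60, 0.70, 0.70]
    print(paired_t_statistic(method_a, method_b))
```

The statistic would then be compared against a t distribution with the returned degrees of freedom to obtain a p-value. That step is omitted here to keep the sketch free of third-party dependencies.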

Crossref

PubMed Central

Structural profiles of human miRNA families from pairwise clustering

Author: Gorodkin Jan
Havgaard Jakob Hull
Kaczkowski Bogumił
Reiche Kristin
Stadler Peter F.
Torarinsson Elfar
Publication venue
Publication date: 06/11/2018

MicroRNAs (miRNAs) are a group of small, ∼21 nt long, riboregulators inhibiting gene expression at a post-transcriptional level. Their most distinctive structural feature is the foldback hairpin of their precursor pre-miRNAs. Even though each pre-miRNA deposited in miRBase has its secondary structure already predicted, little is known about the patterns of structural conservation among pre-miRNAs. We address this issue by clustering the human pre-miRNA sequences based on pairwise sequence and secondary structure alignment using FOLDALIGN, followed by global multiple alignment of the obtained clusters by WAR. As a result, the common secondary structure was successfully determined for four FOLDALIGN clusters: the RF00027 structural family of the Rfam database and three clusters with previously undescribed consensus structures

Qucosa - Publikationsserver der Universität Leipzig

Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine containing genes

Author: B Chaudhuri
B Knudsen
C Notredame
Christian Theil Have
DG Longstaff
E Torarinsson
E Torarinsson
EP Nawrocki
GV Kryukov
Henning Christiansen
I Hofacker
IL Hofacker
IL Hofacker
IU Heinemann
J Atkins
J Reeder
JA Krzycki
JD Thompson
JD Thompson
K Katoh
M Bauer
M Fujita
M Höchsmann
MA Gaston
MA Gaston
N Wirth
S Bernhart
S Lindgreen
S Mørk
S Will
SE Seemann
SF Altschul
Sine Zambach
T Abe
TM Martin Simonsen
X Xu
Y Zhang
Z Yao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

BACKGROUND: Pyrrolysine (the 22nd amino acid) is in certain organisms and under certain circumstances encoded by the amber stop codon, UAG. The circumstances driving pyrrolysine translation are not well understood. The involvement of a predicted mRNA structure in the region downstream UAG has been suggested, but the structure does not seem to be present in all pyrrolysine incorporating genes. RESULTS: We propose a strategy to predict pyrrolysine encoding genes in genomes of archaea and bacteria. We cluster open reading frames interrupted by the amber codon based on sequence similarity. We rank these clusters according to several features that may influence pyrrolysine translation. The ranking effects of different features are assessed and we propose a weighted combination of these features which best explains the currently known pyrrolysine incorporating genes. We devote special attention to the effect of structural conservation and provide further substantiation to support that structural conservation may be influential – but is not a necessary factor. Finally, from the weighted ranking, we identify a number of potentially pyrrolysine incorporating genes. CONCLUSIONS: We propose a method for prediction of pyrrolysine incorporating genes in genomes of bacteria and archaea leading to insights about the factors driving pyrrolysine translation and identification of new gene candidates. The method predicts known conserved genes with high recall and predicts several other promising candidates for experimental verification. The method is implemented as a computational pipeline which is available on request

Crossref

Roskilde Universitet

Springer - Publisher Connector

PubMed Central

An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast

Author: A Harmanci
A Harmanci
A Taneda
A Uzilov
A Uzilov
A Wilm
Akito Taneda
B Knudsen
C Lu
C Notredame
C Notredame
C Selig
CC Chang
CMA Davis Jr
D Dalli
D Dalli
D Rose
D Sankoff
DE Goldberg
E Rivas
E Rivas
E Torarinsson
E Torarinsson
E Torarinsson
F Miura
G Gonsalvez
H Kiryu
H Kiryu
I Hofacker
I Hofacker
I Holmes
J Cherry
J Gorodkin
J Havgaard
J Havgaard
J Pedersen
J Schultz
J Thompson
J Thompson
K Katoh
K Missal
K Missal
L David
M Bauer
M Gerstein
M Samanta
P Carninci
R Dowell
R Klein
R Nussinov
S Needleman
S Washietl
S Washietl
S Washietl
S Will
W Gish
X Xu
Y Tabei
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Aligning RNA sequences with low sequence identity is challenging because taking structural conservation into account requires algorithms of high computational complexity. Although many sophisticated algorithms have been proposed for this purpose, further gains in efficiency are needed to enable large-scale applications, including non-coding RNA (ncRNA) discovery. Results: We developed a new genetic algorithm, Cofolga2, that simultaneously computes pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that the new algorithm is accurate and efficient in both time and memory usage. Combining it with an originally trained SVM, we then applied the algorithm to novel ncRNA discovery by comparing the S. cerevisiae genome with six related genomes in a pairwise manner. By focusing the search on relatively short regions (50 bp to 2,000 bp) flanked by conserved sequences, we predicted 714 intergenic and 1,311 sense or antisense ncRNA candidates, found in pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%). Comparison with previous predictions showed that > 92% of the candidates are novel. The estimated rate of false positives among the predicted candidates is 51%. Twenty-five percent of the intergenic candidates have support for expression in the cell, i.e. their genomic positions overlap those of experimentally determined transcripts in the literature. Moreover, by manual inspection of the results, we obtained four multiple alignments with low sequence identity that reveal consensus structures shared by three species/sequences. Conclusion: The present method provides an efficient tool complementary to sequence-alignment-based ncRNA finders.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ProTISA: a comprehensive resource for translation initiation site annotation in prokaryotic genomes

Author: Aivaliotis
Bailey
Besemer
Chang
Crooks
Frishman
G.-Q. Hu
H. Zhu
Hershberg
Londei
Ma
Moll
Ou
P. Ortet
Poole
Rudd
Sazuka
Starmer
Strohl
Suzek
Torarinsson
Wu
X. Zheng
Y.-F. Yang
Z.-S. She
Publication venue: Oxford University Press
Publication date: 01/01/2008

Correct annotation of translation initiation sites (TISs) is essential for both experimental and bioinformatics studies of the prokaryotic translation initiation mechanism, as well as for understanding gene regulation and gene structure. Here we describe a comprehensive database, ProTISA, which collects TISs confirmed by a variety of available evidence for prokaryotic genomes, including Swiss-Prot experimental records, the literature, conserved domain hits and sequence alignment between orthologous genes. Moreover, by incorporating the predictions from our recently developed TIS post-processor, ProTISA provides a refined annotation for the public database RefSeq. Furthermore, the database annotates the potential regulatory signals associated with translation initiation in the region upstream of the TIS. As of July 2007, ProTISA includes 440 microbial genomes with more than 390 000 confirmed TISs. The database is available at http://mech.ctb.pku.edu.cn/protis

Crossref

PubMed Central

The identification and functional annotation of RNA structures conserved in vertebrates

Author: Bang-Berthelsen Claus H
Christensen-Dalsgaard Mikkel
Garde Christian
Gorodkin Jan
Hansen Claus
Mirza Aashiq Hussain
Nielsen Henrik
Pociot Flemming
Ruzzo Walter L.
Seemann Ernst Stefan
Tommerup Niels
Torarinsson Elfar
Workman Christopher T
Yao Zizhen
Publication venue: Cold Spring Harbor Laboratory
Publication date: 01/01/2017

Copenhagen University Research Information System

Online Research Database In Technology

PETcofold: predicting conserved interactions and structures of two multiple alignments of RNA sequences

Author: Alkan
Altschul
Andreas S. Richter
Andronescu
Argaman
Bachellerie
Backofen
Bernhart
Bompfünewerer
Brunel
Busch
Byun
Chitsaz
Chitsaz
Dirks
Felsenstein
Gardner
Gardner
Gaspin
Geissmann
Gesell
Gorodkin
Gorodkin
Hertel
Hofacker
Horler
Huang
Huang
Hüttenhofer
Jan Gorodkin
Kato
Katoh
Knudsen
Knudsen
Kolbe
Lestrade
Li
Matthews
Menzel
Mercer
Mückstein
Mückstein
Pervouchine
Ravasi
Rehmsmeier
Richter
Rolf Backofen
Salari
Salari
Seemann
Seemann
Sharma
Stefan E. Seemann
Tafer
Taft
Tanja Gesell
The ENCODE Project Consortium
Torarinsson
Torarinsson
Tycowski
Udekwu
Večerek
Vinh
Vitali
Washietl
Washietl
Waterhouse
Waters
Watson
Weinberg
Will
Wilusz
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2011

Motivation: Predicting RNA–RNA interactions is essential for determining the function of putative non-coding RNAs. Existing methods for the prediction of interactions are all based on single sequences. Since comparative methods have already been useful in RNA structure determination, we assume that conserved RNA–RNA interactions also imply conserved function. Of these, we further assume that a non-negligible amount of the existing RNA–RNA interactions have also acquired compensating base changes throughout evolution. We implement a method, PETcofold, that can take covariance information in intra-molecular and inter-molecular base pairs into account to predict interactions and secondary structures of two multiple alignments of RNA sequences

CiteSeerX

Crossref

PubMed Central

Copenhagen University Research Information System

Lightweight comparison of RNAs based on exact sequence–structure matches

Author: Allali
Altschul
Backofen
Bafna
Bahr
Bauer
Blin
Cannone
Evans
Gardner
Griffiths-Jones
Havgaard
Hentze
Hofacker
Hofacker
Huttenhofer
Höchsmann
Jiang
Jiang
Lin
Martineau
Mathews
Mathews
Michael Beckstette
Otto
Rolf Backofen
Sankoff
Sebastian Will
Serganov
Steffen Heyne
Torarinsson
Will
Wilm
Wilting
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Specific functions of ribonucleic acid (RNA) molecules are often associated with different motifs in the RNA structure. The key feature that forms such an RNA motif is the combination of sequence and structure properties. In this article, we introduce a new RNA sequence–structure comparison method which maintains exact matching substructures. Existing common substructures are treated as whole units, while variability is allowed between such structural motifs

CiteSeerX

Crossref

PubMed Central

Publications at Bielefeld University