57 research outputs found

Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments

Author: Jan Gorodkin
Rolf Backofen
Stefan E. Seemann
Publication venue: Oxford University Press
Publication date: 01/01/2008

Computational methods for determining the secondary structure of RNA sequences from given alignments are currently either based on thermodynamic folding, compensatory base pair substitutions or both. However, there is currently no approach that combines both sources of information in a single optimization problem. Here, we present a model that formally integrates both the energy-based and evolution-based approaches to predict the folding of multiple aligned RNA sequences. We have implemented an extended version of Pfold that identifies base pairs that have high probabilities of being conserved and of being energetically favorable. The consensus structure is predicted using a maximum expected accuracy scoring scheme to smoothen the effect of incorrectly predicted base pairs. Parameter tuning revealed that the probability of base pairing has a higher impact on the RNA structure prediction than the corresponding probability of being single stranded. Furthermore, we found that structurally conserved RNA motifs are mostly supported by folding energies. Other problems (e.g. RNA-folding kinetics) may also benefit from employing the principles of the model we introduce. Our implementation, PETfold, was tested on a set of 46 well-curated Rfam families and its performance compared favorably to that of Pfold and RNAalifold

CiteSeerX

PubMed Central

Copenhagen University Research Information System

Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures

Author: Anthon Christian
Gorodkin Jan
Sabarinathan Radhakrishnan
Seemann Stefan E
Publication venue: MDPI AG
Publication date: 01/12/2018

Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes, because these methods do not necessarily provide precise information about the boundaries or the location of the RNA structure inside the predicted ncRNA. Even without a structure prediction, it is of interest to search for structured domains, for example to find common RNA motifs in RNA-protein binding assays. Precise definition of the boundaries is essential for downstream analyses such as RNA structure modelling, e.g., through covariance models, and RNA structure clustering in the search for common motifs. Such efforts have so far focused on single sequences; here we present a comparison of boundary definition between single sequences and multiple sequence alignments. We also present a novel approach, named RNAbound, that finds the boundaries based on probabilities of evolutionarily conserved base pairings. We tested the performance of two different methods on a limited number of Rfam families using the annotated structured RNA regions in the human genome and their multiple sequence alignments created from 14 species. The results show that multiple sequence alignments improve the boundary prediction for branched structures compared to single sequences, independent of the chosen method. The actual performance of the two methods differs on single hairpin structures and branched structures. For the RNA families with branched structures, including transfer RNA (tRNA) and small nucleolar RNAs (snoRNAs), RNAbound improves the boundary predictions using multiple sequence alignments to median differences of −6 and −11.5 nucleotides (nts) for the left and right boundary, respectively (window size of 200 nts).

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Copenhagen University Research Information System

DotAligner: Identification and clustering of RNA structure motifs

Author: Mattick John S.
Quek Xiu Cheng
Seemann Stefan E.
Smith Martin A.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2017

The diversity of processed transcripts in eukaryotic genomes poses a challenge for the classification of their biological functions. Sparse sequence conservation in non-coding sequences and the unreliable nature of RNA structure predictions further exacerbate this conundrum. Here, we describe a computational method, DotAligner, for the unsupervised discovery and classification of homologous RNA structure motifs from a set of sequences of interest. Our approach outperforms comparable algorithms at clustering known RNA structure families, both in speed and accuracy. It identifies clusters of known and novel structure motifs from ENCODE immunoprecipitation data for 44 RNA-binding proteins.

Directory of Open Access Journals

Copenhagen University Research Information System

Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

Author: Jan Gorodkin
Michael J Hawrylycz
Stefan E Seemann
Susan M Sunkin
Walter L Ruzzo
Publication venue: Springer Nature
Publication date: 01/01/2012

BACKGROUND: Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. RESULTS: By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. CONCLUSIONS: Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function

Springer - Publisher Connector

PubMed Central

RNAscClust: Clustering RNA sequences using structure conservation and graph based motifs

Author: Backofen Rolf
Costa Fabrizio
Gorodkin Jan
Havgaard Jakob Hull
Junge Alexander
Miladi Milad
Seemann Stefan E.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2017

Copenhagen University Research Information System

PETcofold: predicting conserved interactions and structures of two multiple alignments of RNA sequences

Author: Andreas S. Richter
Jan Gorodkin
Rolf Backofen
Stefan E. Seemann
Tanja Gesell
Publication venue: Oxford University Press
Publication date: 01/01/2011

Motivation: Predicting RNA–RNA interactions is essential for determining the function of putative non-coding RNAs. Existing methods for the prediction of interactions are all based on single sequences. Since comparative methods have already been useful in RNA structure determination, we assume that conserved RNA–RNA interactions also imply conserved function. Of these, we further assume that a non-negligible amount of the existing RNA–RNA interactions have also acquired compensating base changes throughout evolution. We implement a method, PETcofold, that can take covariance information in intra-molecular and inter-molecular base pairs into account to predict interactions and secondary structures of two multiple alignments of RNA sequences

CiteSeerX

Crossref

PubMed Central

Copenhagen University Research Information System

Broadening the miRNA Catalogue in Livestock Species

Author: Amaral Andreia J.
Anthon Christian
Arya Anoop
Crooijmans Richard P.M.A.
Gama Luís
Giuffra Elisabetta
Gorodkin Jan
Groenen Martien A.M.
Haack Fiete
Hoffmann Anne
Kantanen Juha
Lagnel Jacques
Madsen Ole
Marthey Sylvain
Palasca Oana
Pokharel Kisun
Seemann Stefan E.
Stadler Peter F.
Publication venue: Laser Pages Publishing Ltd.
Publication date: 01/01/2018

Jukuri

RNAcentral 2021: secondary structure integration, improved sequence search and new member databases

RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world's largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community

Ghent University Academic Bibliography

Copenhagen University Research Information System

RNAcentral 2021: secondary structure integration, improved sequence search and new member databases.

Ghent University Academic Bibliography

Copenhagen University Research Information System

eScholarship - University of California

Apollo (Cambridge)