smyRNA: A Novel Ab Initio ncRNA Gene Finder

Author: A Coventry
A Fontaine
C Dieterich
Cagri Aksay
D di Bernardo
DP Bartel
E Bonnet
E Rivas
E Rivas
Emre Karakoc
G Storz
IL Hofacker
IL Hofacker
IM Meyer
IM Meyer
Iman Hajirasouliha
J Thompson
JS Pedersen
M Margulies
Peter J. Unrau
Raheleh Salari
RJ Carter
S Griffiths-Jones
S Washietl
S. Cenk Sahinalp
SR Eddy
SR Eddy
Stefan Maas
Z Yao
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by establishing stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Although ncRNAs are abundant, discovering novel ncRNAs in genome sequences has proven to be a hard task; in particular, past attempts at ab initio ncRNA search mostly failed, with the exception of tools that can identify microRNAs. Methodology/Principal Findings: We present a very general ab initio ncRNA gene finder that exploits differential distributions of sequence motifs between ncRNAs and background genome sequences. Conclusions/Significance: Our method, once trained on a set of ncRNAs from a given species, can be applied to the genome sequences of other organisms to find not only ncRNAs homologous to those in the training set but also others that potentially belong to novel (and perhaps unknown) ncRNA families. Availability
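The idea of exploiting differential motif distributions between ncRNAs and background sequence can be illustrated with a minimal sketch. This is not the smyRNA implementation: the fixed k-mer length, the pseudocount, and the function names `kmer_counts`, `log_odds_table`, and `score` are all assumptions chosen for illustration. The sketch scores a candidate sequence by summing log-odds of its k-mers under a "known ncRNA" set versus a "background" set.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; other characters are ignored when scoring


def kmer_counts(seqs, k):
    """Count every overlapping k-mer in a collection of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def log_odds_table(ncrna_seqs, background_seqs, k=3, pseudocount=1.0):
    """Hypothetical helper: log(P(kmer | ncRNA) / P(kmer | background)).

    A pseudocount is added to every possible k-mer so that motifs unseen
    in one of the two sets still get a finite score.
    """
    fg = kmer_counts(ncrna_seqs, k)
    bg = kmer_counts(background_seqs, k)
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    fg_total = sum(fg[m] + pseudocount for m in all_kmers)
    bg_total = sum(bg[m] + pseudocount for m in all_kmers)
    return {
        m: math.log(((fg[m] + pseudocount) / fg_total)
                    / ((bg[m] + pseudocount) / bg_total))
        for m in all_kmers
    }


def score(seq, table, k=3):
    """Sum the log-odds of all k-mers in seq; k-mers not in the table add 0."""
    return sum(table.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    table = log_odds_table(["GGGAAACCC", "GGGAAACCC"], ["AUAUAUAUA"], k=3)
    print(score("GGGAAA", table) > 0)   # motif-rich candidate scores above zero
    print(score("AUAUAU", table) < 0)   # background-like candidate scores below zero
```

A positive score means the sequence's motif content looks more like the training ncRNAs than like the background; in practice a threshold on such a score would have to be calibrated on held-out data.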

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Simon Fraser University Institutional Repository

Operational approach to open dynamics and quantifying initial correlations

Author: A Kossakowski
A Pomyalov
A Rivas
A Shaji
A Smirne
A-M Kuah
AM Childs
ARU Devi
C Meiera
CA Rodríguez-Rosario
DZ Rossatto
E Geva
E-M Laine
ECG Sudarshan
G Engel
G Lindblad
H Carteret
H-P Breuer
IL Chuang
J Emerson
J Yuen-Zhou
J Yuen-Zhou
JF Poyatos
JL O'Brien
JM Chow
K Modi
K Modi
M Howard
M Neeley
MA Nielsen
MM Wolf
MW Mitchell
SH Myrskog
V Gorini
YS Weinstein
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

A central aim of physics is to describe the dynamics of physical systems. Schrödinger's equation does this for isolated quantum systems. Describing the time evolution of a quantum system that interacts with its environment, in its most general form, has proved to be difficult because the dynamics depends on the state of the environment and the correlations with it. For discrete processes, such as quantum gates or chemical reactions, quantum process tomography provides the complete description of the dynamics, provided that the initial states of the system and the environment are independent of each other. However, many physical systems are correlated with the environment at the beginning of the experiment. Here, we give a prescription of quantum process tomography that yields the complete description of the dynamics of the system even when initial correlations are present. Surprisingly, our method also gives quantitative expressions for the initial correlation. Comment: Completely re-written for clarity of presentation. 15 pages and 2 figures

arXiv.org e-Print Archive

RNA secondary structure prediction from multi-aligned sequences

It is well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is an important task in RNA bioinformatics; the methods are useful not only for further functional analyses of ncRNAs but also for improving the accuracy of secondary structure predictions and for finding novel functional RNAs in the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by utilizing the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions. Comment: A preprint of an invited review manuscript that will be published in a chapter of the book 'Methods in Molecular Biology'. Note that this version of the manuscript may differ from the published version

arXiv.org e-Print Archive

CiteSeerX

Crossref

Conserved Secondary Structures in Aspergillus

Author: A Coventry
Abigail Manson McGuire
Alan Christoffels
AV Kochetov
B Ma
B Mazumder
BL Bass
BR Graveley
C Weile
D di Bernardo
D Kampa
D Libri
E Rivas
E Rivas
FP Roth
H Miyaso
IL Hofacker
IL Hofacker
IL Hofacker
IM Meyer
J Cheng
James E. Galagan
JE Galagan
JE Galagan
JM Johnson
JM Kreahling
JP McCutcheon
JS Pedersen
JS Pedersen
K Clyde
K Missal
K Missal
KJ Howe
L He
L Katz
LP Lim
M Brudno
M Dsouza
M Kozak
M Machida
NT Parkin
P Avner
P Bertone
PD Zamore
R Walczak
RC Lee
S Cawley
S Griffiths-Jones
S Griffiths-Jones
S Washietl
S Washietl
S Washietl
S Washietl
S Will
SA Shabalina
SF Altschul
SR Eddy
T Babak
T Imanishi
TM Lowe
W Winkler
WC Nierman
Y Chen
Y Okazaki
Z Yao
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: Recent evidence suggests that the number and variety of functional RNAs (ncRNAs as well as cis-acting RNA elements within mRNAs) is much higher than previously thought; thus, the ability to computationally predict and analyze RNAs has taken on new importance. We have computationally studied the secondary structures in an alignment of six Aspergillus genomes. Little is known about the RNAs present in this set of fungi, and this diverse set of genomes has an optimal level of sequence conservation for observing the correlated evolution of base-pairs seen in RNAs. Methodology/Principal Findings: We report the results of a whole-genome search for evolutionarily conserved secondary structures, as well as the results of clustering these predicted secondary structures by structural similarity. We find a total of 7450 predicted secondary structures, including a new predicted ~60 bp long hairpin motif found primarily inside introns. We find no evidence for microRNAs. Different types of genomic regions are over-represented in different classes of predicted secondary structures. Exons contain the longest motifs (primarily long, branched hairpins), 5′ UTRs primarily contain groupings of short hairpins located near the start codon, and 3′ UTRs contain very little secondary structure compared to other regions. There is a large concentration of short hairpins just inside the boundaries of exons. The density of predicted intronic RNAs increases with the length of introns, and the density of predicted secondary structures within mRNA coding regions increases with the number of introns in a gene.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

Author: AV Uzilov
B Gulko
B Knudsen
B Knudsen
B Morgenstern
D Sankoff
DH Mathews
DH Mathews
DH Mathews
DKY Chiu
DS Fields
E Rivas
G Storz
I Holmes
I Holmes
I Holmes
IL Hofacker
IL Hofacker
IL Hofacker
J Gorodkin
J Gorodkin
J Gorodkin
J Reeder
J Wuyts
J Wuyts
JE Hopcroft
JE Tabaska
JH Havgaard
M Zuker
M Zuker
MS Waterman
NR Pace
O Perriquet
PP Gardner
R Durbin
R Giegerich
R Green
R Lück
R Nussinov
RD Dowell
RD Dowell
Robin D Dowell
RR Gutell
RR Gutell
RR Gutell
S Batzoglou
S Griffiths-Jones
Sean R Eddy
SR Eddy
SV Muse
V Juan
VR Akmaev
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses
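The pin-selection step described above (choosing a few confidently aligned positions from alignment posterior probabilities) can be sketched as follows. This is an illustrative assumption, not Consan's actual procedure: the threshold value, the greedy ordering, and the function name `select_pins` are hypothetical. The input is a matrix whose entry `posterior[i][j]` is taken to be the posterior probability that position `i` of the first sequence aligns to position `j` of the second.

```python
def select_pins(posterior, threshold=0.95, max_pins=None):
    """Greedily pick high-confidence aligned pairs that are mutually collinear.

    posterior: list of rows, posterior[i][j] in [0, 1].
    threshold: assumed confidence cutoff for a position pair to qualify.
    max_pins:  optional cap on the number of pins returned.
    """
    # Collect every pair at or above the cutoff, most confident first.
    candidates = [
        (p, i, j)
        for i, row in enumerate(posterior)
        for j, p in enumerate(row)
        if p >= threshold
    ]
    candidates.sort(reverse=True)

    pins = []
    for _, i, j in candidates:
        if max_pins is not None and len(pins) >= max_pins:
            break
        # A new pin must lie strictly before or strictly after every
        # existing pin in BOTH sequences, so that pins never cross.
        if all((i < a and j < b) or (i > a and j > b) for a, b in pins):
            pins.append((i, j))
    return sorted(pins)


if __name__ == "__main__":
    # Toy posterior matrix, invented for illustration.
    toy = [
        [0.99, 0.00, 0.00],
        [0.00, 0.50, 0.00],
        [0.00, 0.00, 0.97],
    ]
    print(select_pins(toy))  # [(0, 0), (2, 2)]
```

Each pin splits the dynamic-programming problem into independent segments, which is why constraining on a handful of pins can reduce runtime and memory; a stricter threshold yields fewer but safer pins.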

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Directed acyclic graph kernels for structural RNA analysis

Author: B Knudsen
B Schölkopf
CB Do
D Haussler
D Sankoff
DB Searls
DM Tax
E Rivas
EK Freyhult
H Kiryu
H Saigo
I Holmes
IL Hofacker
IL Hofacker
J Hertel
J Hertel
JD Thompson
JS McCaskill
JS Pedersen
JW Brown
K Sato
Kengo Sato
Kiyoshi Asai
MA Rosenblad
P Pacheco
RD Dowell
RE Fan
RJ Klein
S Washietl
S Washietl
S Will
SR Eddy
SR Eddy
SR Eddy
T Babak
T Kin
Toutai Mituyama
W Deng
Y Sakakibara
Y Sakakibara
Y Sakakibara
Yasubumi Sakakibara
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

De Novo Discovery of Structured ncRNA Motifs in Genomic Sequences

Author: A Marchler-Bauer
A Stamatakis
AF Bompfünewerer
AF Bompfünewerer
BJ Parker
CB Do
D Sankoff
E Rivas
E Torarinsson
E Torarinsson
EE Regulski
ENCODE Project Consortium
EP Nawrocki
G Lunter
HH Tseng
IL Hofacker
IL Hofacker
J Gorodkin
J Gorodkin
JE Barrick
JH Havgaard
JS Mattick
JX Wang
K Missal
M Blanchette
MM Meyer
N Sudarsan
P Anandam
PP Gardner
R Durbin
S Griffiths-Jones
S Griffiths-Jones
S Washietl
S Washietl
S Will
SHF Bernhart
SR Eddy
T Babak
V Gowri-Shankar
WJ Kent
Y Ji
Y Sakakibara
Y Sun
Z Weinberg
Z Weinberg
Z Weinberg
Z Weinberg
Z Weinberg
Z Weinberg
Z Yao
Z Yao
ZJ Lu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

Copenhagen University Research Information System

Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines

An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. 
Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Evolutionary Modeling and Prediction of Non-Coding RNAs in Drosophila

Author: A Siepel
A Siepel
A Stark
A Varadarajan
AG Clark
Andrew V. Uzilov
B Knudsen
B Paten
CN Dewey
D Rose
D St Johnston
DP Bartel
DS Parker
E Boyle
E Lécuyer
E Nawrocki
E Rivas
E Rivas
E Torarinsson
G McGuire
Ian Holmes
IL Hofacker
J Brennecke
J Pedersen
J Ruby
JL Thorne
JP Bachellerie
JR Manak
JS Pedersen
JS Pedersen
JS Pedersen
KS Pollard
Lars Barquist
M Crosby
M Mandal
M Pheasant
M Sprinzl
Mitchell E. Skinner
N Bray
N Goldman
PD Rijk
PS Klosterman
RD Dowell
RD Dowell
RK Bradley
Robert Belshaw
Robert K. Bradley
S Griffiths-Jones
S Washietl
T Babak
T Elgavish
T Gesell
TM Lowe
V Ambros
WJ Bruno
YR Bendana
Yuri R. Bendaña
Z Wang
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2009

We performed benchmarks of phylogenetic grammar-based ncRNA gene prediction, experimenting with eight different models of structural evolution and two different programs for genome alignment. We evaluated our models using alignments of twelve Drosophila genomes. We find that ncRNA prediction performance can vary greatly between different gene predictors and subfamilies of ncRNA gene. Our estimates for false positive rates are based on simulations which preserve local islands of conservation; using these simulations, we predict a higher rate of false positives than previous computational ncRNA screens have reported. Using one of the tested prediction grammars, we provide an updated set of ncRNA predictions for D. melanogaster and compare them to previously-published predictions and experimental data. Many of our predictions show correlations with protein-coding genes. We found significant depletion of intergenic predictions near the 3′ end of coding regions and furthermore depletion of predictions in the first intron of protein-coding genes. Some of our predictions are colocated with larger putative unannotated genes: for example, 17 of our predictions showing homology to the RFAM family snoR28 appear in a tandem array on the X chromosome; the 4.5 Kbp spanned by the predicted tandem array is contained within a FlyBase-annotated cDNA

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central