78 research outputs found

Automated alignment of RNA sequences to pseudoknotted structures

Author: Gary D Stormo
Jack E Tabaska
Publication venue: AAAI Press
Publication date: 01/01/1997

Seq7 is a new program for generating multiple structure-based alignments of RNA sequences. By using a variant of Dijkstra's algorithm to find the shortest path through a specially constructed graph, Seq7 is able to align RNA sequences to pseudoknotted structures in polynomial time. In this paper, we describe the operation of Seq7 and demonstrate the program's abilities. We also describe the use of Seq7 in an Expectation-Maximization procedure that automates the process of structural modeling and alignment of RNA sequences.
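The shortest-path step mentioned in this abstract can be illustrated with a generic Dijkstra sketch. This is not Seq7's graph construction, which the abstract does not describe; the toy graph, node names, and weights below are assumptions made only for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances and predecessors from `source`.

    graph: dict mapping node -> list of (neighbor, non-negative weight).
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def path_to(prev, source, target):
    """Rebuild the path source -> target from the predecessor map."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical toy graph; in an alignment setting the nodes would encode
# alignment states and the weights alignment costs.
toy_graph = {
    "start": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("end", 6)],
    "b": [("end", 2)],
    "end": [],
}
dist, prev = dijkstra(toy_graph, "start")
shortest = path_to(prev, "start", "end")
```

With a binary heap the running time is O((V + E) log V), which is polynomial in the size of the graph.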

CiteSeerX

High sensitivity RNA pseudoknot prediction

Author: Aalberts
Batenburg
Cao
Clote
Cornish
Dam
Dirks
Gesteland
Gluick
Gultyaev
Gutell
Hesham Ali
Hofacker
Huang
Kim
Liphardt
Needleman
Plant
Plant
Reeder
Ren
Rivas
Ruan
Su
Tabaska
Tinoco
Tuerk
Wang
Xia
Xiaolu Huang
Yingling
Zuker
Zuker
Publication venue: Oxford University Press
Publication date: 19/12/2006

Most ab initio pseudoknot prediction methods provide very few folding scenarios for a given RNA sequence and have low sensitivities. RNA researchers, in many cases, would rather sacrifice specificity for a much higher sensitivity in pseudoknot detection. In this study, we introduce the Pseudoknot Local Motif Model and Dynamic Partner Sequence Stacking (PLMM_DPSS) algorithm, which predicts all PLM model pseudoknots within an RNA sequence in a neighboring-region-interference-free fashion. The PLM model is derived from the existing Pseudobase entries. The innovative DPSS approach calculates the lowest stacking energy between two partner sequences. Combined with Mfold, PLMM_DPSS can also be used to predict complicated pseudoknots. Test results for PLMM_DPSS, PKNOTS, iterated loop matching, pknotsRG and HotKnots on Pseudobase sequences show that PLMM_DPSS is the most sensitive of the five methods. PLMM_DPSS also provides manageable pseudoknot folding scenarios for further structure determination.

CiteSeerX

Crossref

PubMed Central

The University of Nebraska, Omaha

Impact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots

Author: C. Alkan
C. Liu
C. Theis
C.M. Reidys
E. Bindewald
E. Rivas
H. Yang
J. Zhao
J.E. Tabaska
M. Jiang
M.R. Garey
M.V. Ashley
R. Nussinov
R.B. Lyngsø
R.B. Lyngsø
S. Griffiths-Jones
S. Ieong
T. Akutsu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Predicting the folding of an RNA sequence, while allowing general pseudoknots (PK), consists in finding a minimal free-energy matching of its $n$ positions. Assuming independently contributing base-pairs, the problem can be solved in $\Theta(n^3)$ time using a variant of the maximal weighted matching. By contrast, the problem was previously proven NP-hard in the more realistic nearest-neighbor energy model. In this work, we consider an intermediate model, called the stacking-pairs energy model. We extend a result by Lyngsø, showing that RNA folding with PK is NP-hard within a large class of parametrizations of the model. We also show the approximability of the problem, by giving a practical $\Theta(n^3)$ algorithm that achieves at least a $5$-approximation for any parametrization of the stacking model. This contrasts nicely with the nearest-neighbor version of the problem, which we prove cannot be approximated within any positive ratio, unless $P=NP$.

Predicting the folding, with general pseudoknots, of an RNA sequence of size $n$ is equivalent to searching for a matching of minimal free energy. In a simple energy model, where each base pair contributes independently to the energy, this problem can be solved in $\Theta(n^3)$ time using a variant of a maximum weighted matching algorithm. However, the same problem has been shown to be NP-hard in the so-called nearest-neighbor energy model. In this work, we study the properties of the problem under a stacking model, which is intermediate between the base-pairing and nearest-neighbor models. We first show that RNA folding with pseudoknots remains NP-hard under many parametrizations of the energy model. We also show that the problem is approximable, by proposing a polynomial-time algorithm that guarantees a $1/5$-approximation. This result illustrates an essential difference between this model and the nearest-neighbor model, for which we show that the problem cannot be approximated within any positive ratio by a polynomial-time algorithm unless $P=NP$.
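The matching formulation for independently contributing base pairs can be illustrated on tiny inputs. The sketch below is not the $\Theta(n^3)$ weighted-matching algorithm the abstract refers to: it is an exponential-time search over subsets of positions, chosen only because it fits in a few lines of standard-library Python. The `PAIR_ENERGY` values are hypothetical, and no minimum loop length is enforced.

```python
from functools import lru_cache

# Hypothetical per-pair energies (more negative = more stable).
PAIR_ENERGY = {
    ("G", "C"): -3, ("C", "G"): -3,
    ("A", "U"): -2, ("U", "A"): -2,
    ("G", "U"): -1, ("U", "G"): -1,
}

def min_energy_matching(seq):
    """Return (energy, pairs) of a minimum-energy matching of positions.

    Pairs may cross, so pseudoknots are allowed. Runs in exponential
    time; intended for very short sequences only.
    """
    n = len(seq)

    @lru_cache(maxsize=None)
    def best(mask):
        # Bit i of mask is set while position i is still undecided.
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1   # lowest undecided position
        rest = mask & ~(1 << i)
        best_e, best_pairs = best(rest)        # option: leave i unpaired
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                e = PAIR_ENERGY.get((seq[i], seq[j]))
                if e is None:
                    continue                   # i and j cannot pair
                sub_e, sub_pairs = best(rest & ~(1 << j))
                if e + sub_e < best_e:
                    best_e = e + sub_e
                    best_pairs = ((i, j),) + sub_pairs
        return best_e, best_pairs

    return best((1 << n) - 1)

energy, pairs = min_energy_matching("GACU")
```

For `"GACU"` the optimum pairs positions 0 with 2 and 1 with 3; these two pairs cross, which is exactly the kind of structure a pseudoknot-free model would exclude.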

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Polytechnique

Sequence determinants in human polyadenylation site selection

Author: A Moreira
C Burge
D Gautheret
D Zarkower
DF Colgan
E Beaudoing
E Beaudoing
F Chen
G Edwalds-Gilbert
G Pesole
J Zhao
JE Tabaska
N Proudfoot
RV Davuluri
S Brackenridge
Y Aissouni
ZF Chou
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: Differential polyadenylation is a widespread mechanism in higher eukaryotes producing mRNAs with different 3' ends in different contexts. This involves several alternative polyadenylation sites in the 3' UTR, each with its specific strength. Here, we analyze the vicinity of human polyadenylation signals in search of patterns that would help discriminate strong and weak polyadenylation sites, or true sites from randomly occurring signals. RESULTS: We used human genomic sequences to retrieve the region downstream of polyadenylation signals, usually absent from cDNA or mRNA databases. Analyzing 4956 EST-validated polyadenylation sites and their -300/+300 nt flanking regions, we clearly visualized the upstream (USE) and downstream (DSE) sequence elements, both characterized by U-rich (not GU-rich) segments. The presence of a USE and a DSE is the main feature distinguishing true polyadenylation sites from randomly occurring A(A/U)UAAA hexamers. While USEs are indifferently associated with strong and weak poly(A) sites, DSEs are more conspicuous near strong poly(A) sites. We then used the region encompassing the hexamer and DSE as a training set for poly(A) site identification by the ERPIN program and achieved a prediction specificity of 69 to 85% for a sensitivity of 56%. CONCLUSION: The availability of complete genomes and large EST sequence databases now permits large-scale observation of polyadenylation sites. U-rich sequences flanking both sides of poly(A) signals contribute to the definition of "true" sites. However, the downstream U-rich sequences may also play an enhancing role. Based on this information, poly(A) site prediction accuracy was moderately but consistently improved compared to the best previously available algorithm.
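As a small illustration of the A(A/U)UAAA hexamer and the U-rich flanking segments described in this abstract, the sketch below scans a sequence for the two hexamer variants and measures the U content of a window. It is not the ERPIN method; the function names, the example sequence, and the window bounds are assumptions made for illustration.

```python
import re

# A(A/U)UAAA covers the two hexamers AAUAAA and AUUAAA.
# The lookahead makes overlapping occurrences visible as well.
HEXAMER = re.compile(r"(?=(A[AU]UAAA))")

def normalize(seq):
    """Upper-case the sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")

def find_polya_signals(seq):
    """Return (0-based start, hexamer) for every candidate signal."""
    rna = normalize(seq)
    return [(m.start(), m.group(1)) for m in HEXAMER.finditer(rna)]

def u_fraction(seq, start, end):
    """Fraction of U in seq[start:end]; 0.0 for an empty window."""
    window = normalize(seq)[max(0, start):end]
    return window.count("U") / len(window) if window else 0.0

# Hypothetical example sequence, not taken from the study.
example = "GGAAUAAAGCUUUUGUAUUAAAC"
signals = find_polya_signals(example)
# U content of the 8 nt directly downstream of the first hexamer.
downstream_u = u_fraction(example, 8, 16)
```

A hexamer match alone does not indicate a true site; according to the abstract, the flanking U-rich elements are what distinguish true sites from randomly occurring hexamers.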

Crossref

HAL AMU

Springer - Publisher Connector

Directory of Open Access Journals

HAL-Inserm

PubMed Central

RNAG: a new Gibbs sampler for predicting RNA secondary structure for unaligned sequences

Author: Bernhart
Bindewald
Carvalho
Cary
Charles E. Lawrence
Chenna
Ding
Ding
Do
Do
Do
Donglai Wei
Eddy
Gardner
Geman
Giegerich
Griffiths-Jones
Gutell
Hamada
Hamada
Hofacker
Hofacker
Ji
Kiryu
Kiryu
Knudsen
Lauren V. Alpert
Lindgreen
Liu
Mathews
Mathews
Meyer
Nawrocki
Nawrocki
Newberg
Sakakibara
Sankoff
Seemann
Siebert
Steffen
Tabaska
Torarinsson
Webb
Webb-Robertson
Will
Xing
Yao
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2011

Motivation: RNA secondary structure plays an important role in the function of many RNAs, and structural features are often key to their interaction with other cellular components. Thus, there has been considerable interest in the prediction of secondary structures for RNA families. In this article, we present a new global structural alignment algorithm, RNAG, to predict consensus secondary structures for unaligned sequences. It uses a blocked Gibbs sampling algorithm, which has a theoretical advantage in convergence time. This algorithm iteratively samples from the conditional probability distributions P(Structure | Alignment) and P(Alignment | Structure). Not surprisingly, there is considerable uncertainty in the high-dimensional space of this difficult problem, which has so far received limited attention in this field. We show how the samples drawn from this algorithm can be used to more fully characterize the posterior space and to assess the uncertainty of predictions.

CiteSeerX

Crossref

PubMed Central

Thermodynamics of RNA structures by Wang–Landau sampling

Author: Abrahams
Bekaert
Bernhart
Bradley
Böck
Cheah
Chen
Clote
Danilova
Dimitrov
Dirks
Eddy
F. Lou
Flamm
Flamm
Griffiths-Jones
Hofacker
Kirkpatrick
Knudsen
Kou
Lim
Lyngsø
Mandal
Markham
Metzler
Nussinov
Omer
Ortiz
P. Clote
Reeder
Reinisch
REN
Rivas
Tabaska
Tucker
van Batenburg
Wang
Weinger
Wuchty
Xayaphoummine
Zhang
Zhao
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Thermodynamics-based dynamic programming RNA secondary structure algorithms have been of immense importance in molecular biology, where applications range from the detection of novel selenoproteins using expressed sequence tag (EST) data, to the determination of microRNA genes and their targets. Dynamic programming algorithms have been developed to compute the minimum free energy secondary structure and partition function of a given RNA sequence, the minimum free-energy and partition function for the hybridization of two RNA molecules, etc. However, the applicability of dynamic programming methods depends on disallowing certain types of interactions (pseudoknots, zig-zags, etc.), as their inclusion renders structure prediction a nondeterministic polynomial time (NP)-complete problem. Nevertheless, such interactions have been observed in X-ray structures.
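To make the dynamic programming idea concrete, here is a minimal Nussinov-style sketch that maximizes the number of nested (pseudoknot-free) base pairs. It is a simplified stand-in: the thermodynamic algorithms mentioned in the abstract minimize free energy under a nearest-neighbor model, which is not attempted here. The allowed pair set and the `min_loop` default are assumptions.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq`.

    min_loop: minimum number of unpaired bases enclosed by a pair.
    Crossing pairs (pseudoknots) are excluded by construction, because
    pairing k with j splits the interval into two independent parts.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                 # case: j is unpaired
            for k in range(i, j - min_loop):    # case: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

hairpin_pairs = nussinov("GGGAAAUCC")
```

The three nested loops give O(n^3) time and O(n^2) space. The recursion relies on the two sides of a pair being independent, which is the assumption that pseudoknots break.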

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

HAL-Rennes 1

MicroRNA-mediated up-regulation of an alternatively polyadenylated variant of the mouse cytoplasmic β-actin gene

Author: Bachvarova
Beaudoing
Beaudoing
Beena Pillai
Bettinger
Bommer
Chaitali Bhattacharjee
Chang
Cheadle
Colgan
Curtis
Eckner
Edmonds
Edwalds-Gilbert
Eom
Ford
Freilich
Gilmartin
Griffiths-Jones
Grummt
Hermeking
Hofmann
Hu
Huang
Jackson
Kartik Soni
Keller
Kislauskis
Kye
Legendre
Lin
Mahantappa Halimani
Manley
Miralles
Paynton
Ponte
Preiss
Proudfoot
Proudfoot
Raver-Shapira
Ross
Sachs
Sheets
Tabaska
Tanay Ghosh
Tarasov
Tazawa
Tian
Tiruchinapalli
Vasudevan
Vinod Scaria
Wahle
Wahle
Yan
Zhang
Zhao
Publication venue: Oxford University Press
Publication date

Actin is a major cytoskeletal protein in eukaryotes. Recent studies suggest more diverse functional roles for this protein. Actin mRNA is known to be localized to neuronal synapses and undergoes rapid deadenylation during early developmental stages. However, its 3′-untranslated region (UTR) is not characterized and there are no experimentally determined polyadenylation (polyA) sites in actin mRNA. We have found that the cytoplasmic β-actin (Actb) gene generates two alternative transcripts terminated at tandem polyA sites. We used 3′-RACE, EST end analysis and in situ hybridization to unambiguously establish the existence of two 3′-UTRs of varying length in the Actb transcript in mouse neuronal cells. Further analyses showed that these two tandem polyA sites are used in a tissue-specific manner. Although the longer 3′-UTR was expressed at a relatively lower level, it conferred higher translational efficiency to the transcript. The longer transcript harbours a conserved mmu-miR-34a/34b-5p target site. Both a sequence-specific anti-miRNA molecule and mutations of the miRNA target region in the 3′-UTR resulted in reduced expression. The expression was restored by a mutant miRNA complementary to the mutated target region, implying that miR-34 binding to the Actb 3′-UTR up-regulates target gene expression. Heterogeneity of the Actb 3′-UTR could shed light on the mechanism of miRNA-mediated regulation of messages in neuronal cells.

Crossref

PubMed Central

Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

Author: AV Uzilov
B Gulko
B Knudsen
B Knudsen
B Morgenstern
D Sankoff
DH Mathews
DH Mathews
DH Mathews
DKY Chiu
DS Fields
E Rivas
G Storz
I Holmes
I Holmes
I Holmes
IL Hofacker
IL Hofacker
IL Hofacker
J Gorodkin
J Gorodkin
J Gorodkin
J Reeder
J Wuyts
J Wuyts
JE Hopcroft
JE Tabaska
JH Havgaard
M Zuker
M Zuker
MS Waterman
NR Pace
O Perriquet
PP Gardner
R Durbin
R Giegerich
R Green
R Lück
R Nussinov
RD Dowell
RD Dowell
Robin D Dowell
RR Gutell
RR Gutell
RR Gutell
S Batzoglou
S Griffiths-Jones
Sean R Eddy
SR Eddy
SV Muse
V Juan
VR Akmaev
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Phylogenetic analysis of mRNA polyadenylation sites reveals a role of transposable elements in evolution of the 3′-end of genes

Author: An
Ara
Babushok
Batzer
Beaudoing
Beaudoing
Belancio
Bennett
Bentley
Bin Tian
Buratowski
Chen
Chen
Cheng
Colgan
Danckwardt
Dewannieux
Edmonds
Edwalds-Gilbert
Gilmartin
Giorgio Matassi
Hall-Pogar
Han
Hu
Jacobson
Ju Youn Lee
Kent
Khan
Labuda
Lander
Lee
Lee
Legendre
Mariner
McMahon
Medstrand
Mills
Perepelitsa-Belancio
Peterson
Phillips
Proudfoot
Roy-Engel
Sachs
Salisbury
Sela
Sironi
Smalheiser
Smit
Smit
Sorek
Szak
Tabaska
Tian
Tian
van de Lagemaat
Wang
Wickens
Wicker
Yan
Zhang
Zhang
Zhao
Zhe Ji
Zhu
Publication venue: Oxford University Press
Publication date

mRNA polyadenylation is an essential step for the maturation of almost all eukaryotic mRNAs, and is tightly coupled with termination of transcription in defining the 3′-end of genes. Large numbers of human and mouse genes harbor alternative polyadenylation sites [poly(A) sites] that lead to mRNA variants containing different 3′-untranslated regions (UTRs) and/or encoding distinct protein sequences. Here, we examined the conservation and divergence of different types of alternative poly(A) sites across human, mouse, rat and chicken. We found that the 3′-most poly(A) sites tend to be more conserved than upstream ones, whereas poly(A) sites located upstream of the 3′-most exon, also termed intronic poly(A) sites, tend to be much less conserved. Genes with longer evolutionary history are more likely to have alternative polyadenylation, suggesting gain of poly(A) sites through evolution. We also found that nonconserved poly(A) sites are associated with transposable elements (TEs) to a much greater extent than conserved ones, albeit less frequently utilized. Different classes of TEs have different characteristics in their association with poly(A) sites via exaptation of TE sequences into polyadenylation elements. Our results establish a conservation pattern for alternative poly(A) sites in several vertebrate species, and indicate that the 3′-end of genes can be dynamically modified by TEs through evolution

Crossref

PubMed Central

Why Does the Giant Panda Eat Bamboo? A Comparative Analysis of Appetite-Reward-Related Genes among Mammals

Author: A Tanzer
BP Lewis
BP Lewis
C Jin
Chenyi Xue
CT Lee
D Smedley
DP Bartel
E Mikolajczyk
EC Pooley
ES Dierenfeld
GB Schaller
GQ Zhao
H Endo
H Endo
H Frieling
H Zhao
HR Berthoud
J Chandrashekar
J Haavik
J Vidgren
JA Roth
JE Tabaska
Jinyi Qian
JL Gittleman
K Katoh
K Rutherford
KC Miranda
Ke Jin
KM Tuohy
M Huotari
M Kozak
M Kozak
M Kozak
M Rask-Andersen
M Zuker
M. James C. Crabbe
Masami Hasegawa
MJ Salesa
N Eswar
NR Lenard
PT Mannisto
R Li
RA Wise
RC Friedman
RD Palmiter
S Fulton
S Griffiths-Jones
S Haider
SB Flagel
Takahiro Yonezawa
Vincent Laudet
Xiaoli Wu
Yang Zhong
Ying Cao
Yong Zhu
Yufang Zheng
Zhen Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Background: The giant panda has an interesting bamboo diet unlike the other species in the order Carnivora. The umami taste receptor gene T1R1 was identified as a pseudogene during the giant panda genome sequencing project, and this was confirmed using a different giant panda sample. The estimated mutation time for this gene is about 4.2 Myr. This mutation coincided with the giant panda’s dietary change and also reinforced its herbivorous lifestyle. However, as this gene is preserved in herbivores such as cow and horse, we need to look for other reasons behind the giant panda’s diet switch. Methodology/Principal Findings: Since taste is part of the reward properties of food related to its energy and nutrition contents, we did a systematic analysis of the genes involved in the appetite-reward system for the giant panda. We extracted the giant panda sequence information for those genes and compared it with the human sequence first and then with seven other species including chimpanzee, mouse, rat, dog, cat, horse, and cow. Orthologs in panda were further analyzed based on the coding region, Kozak consensus sequence, and potential microRNA binding of those genes. Conclusions/Significance: Our results revealed an interesting dopamine metabolic involvement in the panda’s food choice.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Bedfordshire Repository