Search CORE

5 research outputs found

CONTRAfold: RNA secondary structure prediction without physics-based models

Author: Chuong B. Do
Daniel A. Woods
Serafim Batzoglou
Publication venue
Publication date: 01/01/2006
Field of study

doi:10.1093/bioinformatics/btl24

CiteSeerX

Inverse folding of RNA pseudoknot structures

Author: Gao James Z. M.
Li Linda Y. M.
Reidys Christian M.
Publication venue
Publication date: 01/01/2010
Field of study

Background: RNA exhibits a variety of structural configurations. Here we consider a structure to be tantamount to the noncrossing Watson-Crick and \pairGU-base pairings (secondary structure) and additional cross-serial base pairs. These interactions are called pseudoknots and are observed across the whole spectrum of RNA functionalities. In the context of studying natural RNA structures, searching for new ribozymes and designing artificial RNA, it is of interest to find RNA sequences folding into a specific structure and to analyze their induced neutral networks. Since the established inverse folding algorithms, {\tt RNAinverse}, {\tt RNA-SSD} as well as {\tt INFO-RNA} are limited to RNA secondary structures, we present in this paper the inverse folding algorithm {\tt Inv} which can deal with 3-noncrossing, canonical pseudoknot structures. Results: In this paper we present the inverse folding algorithm {\tt Inv}. We give a detailed analysis of {\tt Inv}, including pseudocodes. We show that {\tt Inv} allows to design in particular 3-noncrossing nonplanar RNA pseudoknot 3-noncrossing RNA structures-a class which is difficult to construct via dynamic programming routines. {\tt Inv} is freely available at \url{http://www.combinatorics.cn/cbpc/inv.html}. Conclusions: The algorithm {\tt Inv} extends inverse folding capabilities to RNA pseudoknot structures. In comparison with {\tt RNAinverse} it uses new ideas, for instance by considering sets of competing structures. As a result, {\tt Inv} is not only able to find novel sequences even for RNA secondary structures, it does so in the context of competing structures that potentially exhibit cross-serial interactions.Comment: 20 pages, 26 figure

arXiv.org e-Print Archive

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Prediction of secondary structures for large RNA molecules

Author: Mathuriya Amrita
Publication venue: Georgia Institute of Technology
Publication date: 12/01/2009
Field of study

The prediction of correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. Among the major obstacles is the fact that accurate calculations scale as O(n⁴), so the computational requirements become prohibitive as the length increases. We present a new parallel multicore and scalable program called GTfold, which is one to two orders of magnitude faster than the de facto standard programs mfold and RNAfold for folding large RNA viral sequences and achieves comparable accuracy of prediction. We analyze the algorithm's concurrency and describe the parallelism for a shared memory environment such as a symmetric multiprocessor or multicore chip. We are seeing a paradigm shift to multicore chips and parallelism must be explicitly addressed to continue gaining performance with each new generation of systems. We provide a rigorous proof of correctness of an optimized algorithm for internal loop calculations called internal loop speedup algorithm (ILSA), which reduces the time complexity of internal loop computations from O(n⁴) to O(n³) and show that the exact algorithms such as ILSA are executed with our method in affordable amount of time. The proof gives insight into solving these kinds of combinatorial problems. We have documented detailed pseudocode of the algorithm for predicting minimum free energy secondary structures which provides a base to implement future algorithmic improvements and improved thermodynamic model in GTfold. GTfold is written in C/C++ and freely available as open source from our website.M.S.Committee Chair: Bader, David; Committee Co-Chair: Heitsch, Christine; Committee Member: Harvey, Stephen; Committee Member: Vuduc, Richar

Scholarly Materials And Research @ Georgia Tech

An energy model that predicts the correct folding of both the tRNA and the 5S RNA molecules.

Author: Gouy M
Ninio J
Papanicolaou C
Publication venue
Publication date: 01/01/1984
Field of study

A new set of energy values to predict the secondary structures in RNA molecules has been derived through a multiple-step refinement procedure. It achieves more than 80% success in predicting the cloverleaf pattern in tRNA (200 sequences tested) and more than 60% success in predicting the consensus folding of 5S RNA (100 sequences). Improvements in our initial program for predicting secondary structures, based on the principle of the "incompatibility islets" made possible the work on 5S RNA. The program was speeded up by introducing a dynamic grouping of the islets into three disjoint blocks. The novel features in the energy model include i) an evaluation of the contribution of odd pairs according to their position within a segment ii) a penalty for internal loops related to their dissymmetry iii) a bonus for bulge loops when the two terminal paired bases at the junction point are both pyrimidines

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL Descartes

Conception et analyse des biopuces à ADN en environnements parallèles et distribués

Author: Jaziri Faouzi
Publication venue: HAL CCSD
Publication date: 23/06/2014
Field of study

Microorganisms represent the largest diversity of the living beings. They play a crucial rôle in all biological processes related to their huge metabolic potentialities and their capacity for adaptation to different ecological niches. The development of new genomic approaches allows a better knowledge of the microbial communities involved in complex environments functioning. In this context, DNA microarrays represent high-throughput tools able to study the presence, or the expression levels of several thousands of genes, combining qualitative and quantitative aspects in only one experiment. However, the design and analysis of DNA microarrays, with their current high density formats as well as the huge amount of data to process, are complex but crucial steps. To improve the quality and performance of these two steps, we have proposed new bioinformatics approaches for the design and analysis of DNA microarrays in parallel and distributed environments. These multipurpose approaches use high performance computing (HPC) and new software engineering approaches, especially model driven engineering (MDE), to overcome the current limitations. We have first developed PhylGrid 2.0, a new distributed approach for the selection of explorative probes for phylogenetic DNA microarrays at large scale using computing grids. This software was used to build PhylOPDb: a comprehensive 16S rRNA oligonucleotide probe database for prokaryotic identification. MetaExploArrays, which is a parallel software of oligonucleotide probe selection on different computing architectures (a PC, a multiprocessor, a cluster or a computing grid) using meta-programming and a model driven engineering approach, has been developed to improve flexibility in accordance to user’s informatics resources. Then, PhylInterpret, a new software for the analysis of hybridization results of DNA microarrays. PhylInterpret uses the concepts of propositional logic to determine the prokaryotic composition of metagenomic samples. Finally, a new parallelization method based on model driven engineering (MDE) has been proposed to compute a complete backtranslation of short peptides to select probes for functional microarrays.Les microorganismes constituent la plus grande diversité du monde vivant. Ils jouent un rôle clef dans tous les processus biologiques grâce à leurs capacités d’adaptation et à la diversité de leurs capacités métaboliques. Le développement de nouvelles approches de génomique permet de mieux explorer les populations microbiennes. Dans ce contexte, les biopuces à ADN représentent un outil à haut débit de choix pour l'étude de plusieurs milliers d’espèces en une seule expérience. Cependant, la conception et l’analyse des biopuces à ADN, avec leurs formats de haute densité actuels ainsi que l’immense quantité de données à traiter, représentent des étapes complexes mais cruciales. Pour améliorer la qualité et la performance de ces deux étapes, nous avons proposé de nouvelles approches bioinformatiques pour la conception et l’analyse des biopuces à ADN en environnements parallèles. Ces approches généralistes et polyvalentes utilisent le calcul haute performance (HPC) et les nouvelles approches du génie logiciel inspirées de la modélisation, notamment l’ingénierie dirigée par les modèles (IDM) pour contourner les limites actuelles. Nous avons développé PhylGrid 2.0, une nouvelle approche distribuée sur grilles de calcul pour la sélection de sondes exploratoires pour biopuces phylogénétiques. Ce logiciel a alors été utilisé pour construire PhylOPDb: une base de données complète de sondes oligonucléotidiques pour l’étude des communautés procaryotiques. MetaExploArrays qui est un logiciel parallèle pour la détermination de sondes sur différentes architectures de calcul (un PC, un multiprocesseur, un cluster ou une grille de calcul), en utilisant une approche de méta-programmation et d’ingénierie dirigée par les modèles a alors été conçu pour apporter une flexibilité aux utilisateurs en fonction de leurs ressources matériel. PhylInterpret, quant à lui est un nouveau logiciel pour faciliter l’analyse des résultats d’hybridation des biopuces à ADN. PhylInterpret utilise les notions de la logique propositionnelle pour déterminer la composition en procaryotes d’échantillons métagénomiques. Enfin, une démarche d’ingénierie dirigée par les modèles pour la parallélisation de la traduction inverse d’oligopeptides pour le design des biopuces à ADN fonctionnelles a également été mise en place

Thèses en Ligne

HAL Clermont Université