733 research outputs found

Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization

Author: Bauer Markus
Klau Gunnar W
Reinert Knut
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. Results: We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. Conclusions: The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of inpu

CiteSeerX

Institutional Repository of the Freie Universität Berlin

Springer - Publisher Connector

Directory of Open Access Journals

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

PubMed Central

An enhanced RNA alignment benchmark for sequence alignment programs

Author: Mainz Indra
Steger Gerhard
Wilm Andreas
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The performance of alignment programs is traditionally tested on sets of protein sequences, for which a reference alignment is known. Conclusions drawn from such protein benchmarks do not necessarily hold for the RNA alignment problem, as was demonstrated in the first RNA alignment benchmark published so far. For example, the twilight zone – the similarity range where alignment quality drops drastically – starts at 60 % for RNAs in comparison to 20 % for proteins. In this study we enhance the previous benchmark. RESULTS: The RNA sequence sets in the benchmark database are taken from an increased number of RNA families to avoid unintended impact by using only a few families. The size of sets varies from 2 to 15 sequences to assess the influence of the number of sequences on program performance. Alignment quality is scored by two measures: one takes into account only nucleotide matches, the other measures structural conservation. The performance order of parameters – like nucleotide substitution matrices and gap-costs – as well as of programs is rated by rank tests. CONCLUSION: Most sequence alignment programs perform equally well on RNA sequence sets with high sequence identity, that is with an average pairwise sequence identity (APSI) above 75 %. Parameters for gap-open and gap-extension have a large influence on alignment quality at APSI ≤ 75 %; optimal parameter combinations are shown for several programs. The use of different 4 × 4 substitution matrices improved program performance only in some cases. The performance of iterative programs drastically increases with increasing sequence numbers and/or decreasing sequence identity, which makes them clearly superior to programs using a purely non-iterative, progressive approach. The best sequence alignment programs produce alignments of high quality down to APSI > 55 %; at lower APSI the use of sequence+structure alignment programs is recommended.
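
The abstract above bins its results by average pairwise sequence identity (APSI). As a rough illustration, APSI for a small alignment could be computed as sketched below. The abstract does not say how pairwise identity is defined, so the definition here is an assumption: identical columns divided by columns in which at least one of the two sequences has a residue. The function names and the toy alignment are hypothetical and are not taken from the benchmark.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str, gap: str = "-") -> float:
    """Fraction of identical columns between two aligned sequences.

    Assumption: columns where both sequences have a gap are skipped;
    a residue aligned to a gap counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    return identical / compared if compared else 0.0


def apsi(alignment: list[str]) -> float:
    """Average pairwise sequence identity over all sequence pairs, in percent."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return 100.0 * sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment, invented for illustration only.
example = ["GGCA-U", "GGCAAU", "GACA-U"]
print(f"APSI = {apsi(example):.1f} %")
```

Other definitions of the denominator (for example the length of the shorter sequence) give different values, so thresholds such as 75 % or 55 % only carry over when the same definition is used.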

Springer - Publisher Connector

PubMed Central

LaRA 2: parallel and vectorized program for sequence–structure alignment of RNA sequences

Author: Ficarra Elisa
Reinert Knut
Urgese Gianvito
Winkler Jörg
Publication date: 01/01/2022

Background The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson–Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. In order to discover yet unknown RNA families and infer their possible functions, the structural alignment of RNAs is an essential task. This task demands a lot of computational resources, especially for aligning many long sequences, and it therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add additional complexity to the problem and are often ignored in available software. Results We present the SeqAn-based software LaRA 2 that is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower boundary of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version. Conclusions With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for the search in genomic databases

Institutional Repository of the Freie Universität Berlin

A new graph-based method for pairwise global network alignment

Author: Klau G.W. (Gunnar)
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

CWI's Institutional Repository

A new graph-based method for pairwise global network alignment

Author: Klau Gunnar W
Publication venue: BioMed Central
Publication date: 01/01/2009

Background In addition to component-based comparative approaches, network alignments provide the means to study conserved network topology such as common pathways and more complex network motifs. Yet, unlike in classical sequence alignment, the comparison of networks becomes computationally more challenging, as most meaningful assumptions instantly lead to NP-hard problems. Most previous algorithmic work on network alignments is heuristic in nature. Results We introduce the graph-based maximum structural matching formulation for pairwise global network alignment. We relate the formulation to previous work and prove NP-hardness of the problem. Based on the new formulation we build upon recent results in computational structural biology and present a novel Lagrangian relaxation approach that, in combination with a branch-and-bound method, computes provably optimal network alignments. The Lagrangian algorithm alone is a powerful heuristic method, which produces solutions that are often near-optimal and – unlike those computed by pure heuristics – come with a quality guarantee. Conclusion Computational experiments on the alignment of protein-protein interaction networks and on the classification of metabolic subnetworks demonstrate that the new method is reasonably fast and has advantages over pure heuristics. Our software tool is freely available as part of the LISA library.

Springer - Publisher Connector

CWI's Institutional Repository

Directory of Open Access Journals

PubMed Central

Upcoming challenges for multiple sequence alignment methods in the high-throughput era

Author: Carsten Kemena
Cedric Notredame
Publication venue: Oxford University Press

This review focuses on recent trends in multiple sequence alignment tools. It describes the latest algorithmic improvements including the extension of consistency-based methods to the problem of template-based multiple sequence alignments. Some results are presented suggesting that template-based methods are significantly more accurate than simpler alternative methods. The validation of existing methods is also discussed at length with the detailed description of recent results and some suggestions for future validation strategies. The last part of the review addresses future challenges for multiple sequence alignment methods in the genomic era, most notably the need to cope with very large sequences, the need to integrate large amounts of experimental data, the need to accurately align non-coding and non-transcribed sequences and finally, the need to integrate many alternative methods and approaches

Crossref

PubMed Central

Covariance models for RNA structure prediction

Author: Cuturello Francesca
Publication venue: Trieste
Publication date: 14/10/2019

Many non-coding RNAs are known to play a role in the cell that is directly linked to their structure. Structure prediction based on sequence alone is, however, a challenging task. On the other hand, thanks to the low cost of sequencing technologies, a very large number of homologous sequences are becoming available for many RNA families. In the protein community, the idea has emerged in the last decade of exploiting the covariance of mutations within a family to predict the protein structure using the direct-coupling-analysis (DCA) method. The application of DCA to RNA systems has been limited so far. We here perform an assessment of the DCA method on 17 riboswitch families, comparing it with the commonly used mutual information analysis. We also compare different flavors of DCA, including mean-field, pseudo-likelihood, and a proposed stochastic procedure (Boltzmann learning) for solving exactly the DCA inverse problem. Boltzmann learning outperforms the other methods in predicting contacts observed in high-resolution crystal structures. In order to enhance the prediction of both RNA secondary and tertiary contacts, we discuss the possibility of including a number of informed priors in the estimation of the couplings for the DCA statistical model. We observe a systematic improvement of the DCA performance by embedding in the prior distribution the pairing probability matrices calculated using secondary-structure prediction algorithms
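
The abstract above uses mutual information analysis as the baseline against which DCA is compared. The sketch below shows the standard mutual information between two alignment columns. It does not implement DCA or Boltzmann learning. It treats the gap character as an ordinary symbol and applies no sequence reweighting or pseudocounts, which are simplifying assumptions. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment: list[str], i: int, j: int) -> float:
    """Mutual information (in bits) between alignment columns i and j.

    MI = sum over (a, b) of p_ij(a, b) * log2(p_ij(a, b) / (p_i(a) * p_j(b))),
    with frequencies estimated by simple counting over the sequences.
    """
    n = len(alignment)
    if n == 0:
        raise ValueError("alignment must contain at least one sequence")
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment, invented for illustration: columns 0 and 3 covary as
# complementary bases, while column 1 is fully conserved.
toy = ["GAAC", "CAAG", "AAAU", "UAAA"]
print(column_mutual_information(toy, 0, 3))  # covarying pair
print(column_mutual_information(toy, 0, 1))  # conserved column carries no signal
```

High mutual information between two columns suggests covariation, which may indicate a base pair. Mutual information does not separate direct from indirect correlations, and that limitation is what DCA is designed to address.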

Sissa Digital Library

On the role of metaheuristic optimization in bioinformatics

Author: Benito Sergio
Calvet Laura
Juan Angel A
Prados Ferran
Publication venue: Royal College of Obstetricians & Gynaecologists (RCOG)
Publication date: 01/01/2022

Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, the protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From the previous analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics

UCL Discovery

RiuNet