LaRA 2: parallel and vectorized program for sequence–structure alignment of RNA sequences
Background
The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson–Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. In order to discover yet unknown RNA families and infer their possible functions, the structural alignment of RNAs is an essential task. This task demands a lot of computational resources, especially for aligning many long sequences, and it therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add complexity to the problem and are often ignored in available software.
Results
We present the SeqAn-based software LaRA 2 that is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs, our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower bound of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version.
Conclusions
With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for the search in genomic databases.
Discovering and Certifying Lower Bounds for the Online Bin Stretching Problem
There are several problems in the theory of online computation where tight
lower bounds on the competitive ratio are unknown and expected to be difficult
to describe in a short form. A good example is the Online Bin Stretching
problem, in which the task is to pack the incoming items online into bins while
minimizing the load of the largest bin. Additionally, the optimal load of the
entire instance is known in advance.
The contribution of this paper is twofold. First, we provide the first
non-trivial lower bounds for Online Bin Stretching with 6, 7 and 8 bins, and
increase the best known lower bound for 3 bins. We describe in detail the
algorithmic improvements which were necessary for the discovery of the new
lower bounds, which are several orders of magnitude more complex. The lower
bounds are presented in the form of directed acyclic graphs.
Second, we use the Coq proof assistant to formalize the Online Bin Stretching
problem and certify these large lower bound graphs. The script we propose
also certifies all previously claimed lower bounds, which until now had
never been formally proven. To the best of our knowledge, this is the first use of a
formal verification toolkit to certify a lower bound for an online problem.
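To make the problem statement concrete, the following is a minimal sketch of an online packing run and its stretching factor. The greedy "least-loaded bin" rule and the function name `greedy_bin_stretching` are illustrative assumptions; they are not the algorithm or the lower-bound search described in the paper. Item sizes are scaled so that the known optimal offline load is 1.

```python
from fractions import Fraction


def greedy_bin_stretching(items, num_bins):
    """Place each arriving item into the currently least-loaded bin.

    Returns the load of the fullest bin. With item sizes scaled so that
    the optimal offline load is 1, this value is the stretching factor
    achieved on the given input sequence.
    """
    loads = [Fraction(0)] * num_bins
    for item in items:
        # Online decision: only the loads seen so far are used.
        target = min(range(num_bins), key=lambda i: loads[i])
        loads[target] += Fraction(item)
    return max(loads)


# An offline packing into 2 bins with load 1 exists: {1/2, 1/2} and {1}.
sequence = [Fraction(1, 2), Fraction(1, 2), Fraction(1)]
print(greedy_bin_stretching(sequence, 2))  # prints 3/2
```

On this sequence the greedy rule spreads the two small items over both bins, so the final item of size 1 forces a load of 3/2. A lower bound for the problem is an adversary strategy that forces such a ratio against every online algorithm, not just this one.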
Classroom Examples of Robustness Problems in Geometric Computations
The algorithms of computational geometry are designed for a machine model with exact real arithmetic. Substituting floating point arithmetic for the assumed real arithmetic may cause implementations to fail. Although this is well known, there is no comprehensive documentation of what can go wrong and why. In this extended abstract, we study a simple incremental algorithm for planar convex hulls and give examples which make the algorithm fail in all possible ways. We also show how to construct failure-examples semi-systematically and discuss the geometry of the floating point implementation of the orientation predicate. We hope that our work will be useful for teaching computational geometry. The full paper is available at http://hal.inria.fr/inria-00344310/. It contains further examples, more theory, and color pictures. We strongly recommend reading the full paper instead of this extended abstract.
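The orientation predicate mentioned above can be sketched as follows: a floating point evaluation next to an exact evaluation using rational arithmetic. The function names, the probe points, and the grid size are assumptions chosen for illustration, and no particular disagreement count is claimed here.

```python
from fractions import Fraction


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def orientation_float(p, q, r):
    """Sign of the orientation determinant, evaluated in floating point."""
    det = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return sign(det)


def orientation_exact(p, q, r):
    """Same determinant, evaluated exactly.

    Fraction(float) converts a double without rounding, so the result is
    the true orientation of the three floating point input points.
    """
    px, py, qx, qy, rx, ry = (Fraction(c) for c in (*p, *q, *r))
    det = (qx - px) * (ry - py) - (qy - py) * (rx - px)
    return sign(det)


def count_disagreements(n, step=2.0 ** -53):
    """Count grid points near (0.5, 0.5) where the two predicates differ.

    The line through q and r and the grid spacing are illustrative choices.
    """
    q, r = (12.0, 12.0), (24.0, 24.0)
    mismatches = 0
    for i in range(n):
        for j in range(n):
            p = (0.5 + i * step, 0.5 + j * step)
            if orientation_float(p, q, r) != orientation_exact(p, q, r):
                mismatches += 1
    return mismatches


print(count_disagreements(64), "of", 64 * 64, "grid points disagree")
```

Every point where the two functions disagree is a potential failure-example: an algorithm that branches on `orientation_float` there takes a decision that contradicts the true geometry.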
General Analysis Tool Box for Controlled Perturbation
The implementation of reliable and efficient geometric algorithms is a
challenging task. The reason is the following conflict: On the one hand,
computing with rounded arithmetic may question the reliability of programs
while, on the other hand, computing with exact arithmetic may be too expensive
and hence inefficient. One solution is the implementation of controlled
perturbation algorithms which combine the speed of floating-point arithmetic
with a protection mechanism that nonetheless guarantees reliability.
This paper is concerned with the theoretical performance analysis of
controlled perturbation algorithms. We address it with the
presentation of a general analysis tool box. This tool box is separated into
independent components which are presented individually with their interfaces.
This way, the tool box supports alternative approaches for the derivation of
the most crucial bounds. We present three approaches for this task.
Furthermore, we have thoroughly reworked the concept of controlled perturbation
in order to include rational function based predicates into the theory;
polynomial based predicates are included anyway. In addition, we introduce
object-preserving perturbations. Moreover, the tool box is designed such that
it reflects the actual behavior of the controlled perturbation algorithm at
hand without any simplifying assumptions.
Comment: 90 pages, 30 figures
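The general idea of controlled perturbation can be sketched as a loop: perturb the input randomly, evaluate every predicate in floating point, and accept the run only if each predicate value stays clear of a guard threshold; otherwise enlarge the perturbation and retry. This is a simplified sketch under stated assumptions. The function names, the orientation predicate as the only predicate, and in particular the constant `guard` are illustrative; a real implementation derives the guard from a rigorous error analysis such as the one the tool box is meant to provide.

```python
import itertools
import random


def det_orientation(p, q, r):
    """Floating point orientation determinant of three planar points."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def controlled_perturbation(points, delta=1e-6, guard=1e-12,
                            max_rounds=20, seed=0):
    """Perturb points until all orientation predicates pass the guard.

    guard is a placeholder threshold (an assumption), not a bound derived
    from an error analysis. Returns the perturbed points and the
    perturbation magnitude that finally succeeded.
    """
    rng = random.Random(seed)
    for _ in range(max_rounds):
        perturbed = [(x + rng.uniform(-delta, delta),
                      y + rng.uniform(-delta, delta)) for x, y in points]
        # Accept only if every predicate value is safely away from zero.
        if all(abs(det_orientation(p, q, r)) > guard
               for p, q, r in itertools.combinations(perturbed, 3)):
            return perturbed, delta
        delta *= 2  # guard failed somewhere: retry with a larger perturbation
    raise RuntimeError("no guarded perturbation found")


# Four collinear points: a degenerate input for the orientation predicate.
collinear = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
safe_points, used_delta = controlled_perturbation(collinear)
print(used_delta, safe_points)
```

The performance question studied in the paper corresponds, in this sketch, to how `delta` and the arithmetic precision must be chosen so that the loop succeeds with high probability after few rounds.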