
156 research outputs found

Fast Arc-Annotated Subsequence Matching in Linear Space

Author: D. Harel
G. Blin
G. Lin
I. Munro
J. Alber
P. Bille
P. Damaschke
P. Kilpeläinen
T. Kida
V. Bafna
W. Chen
Publication venue
Publication date: 01/01/2010
Field of study

An arc-annotated string is a string of characters, called bases, augmented with a set of pairs, called arcs, each connecting two bases. Given arc-annotated strings P and Q, the arc-preserving subsequence problem is to determine if P can be obtained from Q by deleting bases from Q. Whenever a base is deleted, any arc with an endpoint in that base is also deleted. Arc-annotated strings where the arcs are "nested" are a natural model of RNA molecules that captures both their primary and secondary structure. The arc-preserving subsequence problem for nested arc-annotated strings is a basic primitive for investigating the function of RNA molecules. Gramm et al. [ACM Trans. Algorithms 2006] gave an algorithm for this problem using O(nm) time and space, where m and n are the lengths of P and Q, respectively. In this paper we present a new algorithm using O(nm) time and O(n + m) space, thereby matching the previous time bound while significantly reducing the space from a quadratic term to linear. This is essential to process large RNA molecules where the space is likely to be a bottleneck. To obtain our result we introduce several novel ideas which may be of independent interest for related problems on arc-annotated strings.

Comment: To appear in Algoritmic
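The problem definition in the abstract above can be made concrete with a small brute-force checker. This is only an illustrative sketch of the definition, not the O(nm)-time, linear-space algorithm of the paper: it tries every way of keeping len(P) bases of Q and therefore takes exponential time. The function name, the 0-based positions and the representation of arcs as pairs of positions are assumptions made for this example.

```python
from itertools import combinations


def is_arc_preserving_subsequence(p, p_arcs, q, q_arcs):
    """Brute-force test of the arc-preserving subsequence definition.

    p, q           -- strings of bases
    p_arcs, q_arcs -- iterables of (i, j) position pairs (0-based, assumed)
    Exponential time; intended for tiny inputs only.
    """
    p_arcs = {tuple(sorted(a)) for a in p_arcs}
    q_arcs = {tuple(sorted(a)) for a in q_arcs}
    m = len(p)
    # Choose which positions of q are kept (all others are deleted).
    for kept in combinations(range(len(q)), m):
        # The kept bases must spell out p.
        if any(q[kept[i]] != p[i] for i in range(m)):
            continue
        # An arc of q survives only if both of its endpoints are kept;
        # renumber the surviving arcs to positions in the shortened string.
        index = {pos: i for i, pos in enumerate(kept)}
        surviving = {
            (index[a], index[b])
            for a, b in q_arcs
            if a in index and b in index
        }
        if surviving == p_arcs:
            return True
    return False


# Q = "ACGU" with a single arc joining A (position 0) and U (position 3).
print(is_arc_preserving_subsequence("AU", [(0, 1)], "ACGU", [(0, 3)]))
print(is_arc_preserving_subsequence("AU", [], "ACGU", [(0, 3)]))
```

In the second call the arc of Q cannot be removed without deleting A or U, so P = "AU" without an arc is not an arc-preserving subsequence of Q.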

arXiv.org e-Print Archive

CiteSeerX

Crossref

Online Research Database In Technology

Comparing RNA structures using a full set of biologically relevant edit operations is intractable

Author: Blin Guillaume
Hamel Sylvie
Vialette Stéphane
Publication venue: HAL CCSD
Publication date: 15/12/2008
Field of study

7 pages. Arc-annotated sequences are useful for representing structural information of RNAs and have been extensively used for comparing RNA structures in terms of both sequence and structural similarities. Among the many paradigms referring to arc-annotated sequences and RNA structure comparison (see \cite{IGMA_BliDenDul08} for more details), the most important one is the general edit distance. The problem of computing an edit distance between two non-crossing arc-annotated sequences was introduced in \cite{Evans99}. The introduced model uses edit operations that involve either single letters or pairs of letters (never considered separately) and is solvable in polynomial time \cite{ZhangShasha:1989}. To account for other possible RNA structural evolutionary events, new edit operations, allowing the letters of a pair to be considered either simultaneously or separately, were introduced in \cite{jiangli}, unfortunately at the cost of computational tractability. It has been proved that comparing two RNA secondary structures using a full set of biologically relevant edit operations is NP-complete. Nevertheless, in \cite{DBLP:conf/spire/GuignonCH05}, the authors have used a strong combinatorial restriction in order to compare two RNA stem-loops with a full set of biologically relevant edit operations, which has allowed them to design a polynomial time and space algorithm for comparing general secondary RNA structures. In this paper we will prove theoretically that comparing two RNA structures using a full set of biologically relevant edit operations cannot be done without strong combinatorial restrictions.

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Hybrid techniques based on solving reduced problem instances for a longest common subsequence problem

Author: Blesa Aguilera Maria Josep
Blum Christian
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018
Field of study

Finding the longest common subsequence of a given set of input strings is a relevant problem arising in various practical settings. One of these problems is the so-called longest arc-preserving common subsequence problem. This NP-hard combinatorial optimization problem was introduced for the comparison of arc-annotated ribonucleic acid (RNA) sequences. In this work we present an integer linear programming (ILP) formulation of the problem. As the application of a general purpose ILP solver is not viable even for rather small problem instances due to the size of the model, we study alternative ways, based on model reduction, to benefit from this ILP model. First, we present a heuristic way of reducing the model, with the subsequent application of an ILP solver. Second, we propose the application of an iterative hybrid algorithm that makes use of an ILP solver for generating high quality solutions at each iteration. Experimental results concerning artificial and real problem instances show that the proposed techniques outperform an available technique from the literature. Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

How to compare arc-annotated sequences: The alignment hierarchy

Author: Bal Harpreet Kaur
Baxter Gregory W
Brodzeli Zourab
Collins Stephen F
Sidiroglou Fotios
Wade Scott A
Publication venue: Springer Verlag
Publication date: 11/10/2006
Field of study

International audience. We describe a new unifying framework to express comparison of arc-annotated sequences, which we call alignment of arc-annotated sequences. We first prove that this framework encompasses main existing models, which allows us to deduce complexity results for several cases from the literature. We also show that this framework gives rise to new relevant problems that have not been studied yet. We provide a thorough analysis of these novel cases by proposing two polynomial time algorithms and an NP-completeness proof. This leads to an almost exhaustive study of alignment of arc-annotated sequences.

HAL - Lille 3

Victoria University Eprints Repository

How to compare arc-annotated sequences: The alignment hierarchy

Author: B. Ma
F. Bernhart
G. Lin
K. Zhang
K.C. Tai
M. Crochemore
S. Dulucq
S. Vialette
T. Jiang
T. Jiang
T.C. Biedl
Publication venue: Springer Verlag
Publication date: 01/01/2006
Field of study

CiteSeerX

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

A hybrid evolutionary algorithm based on solution merging for the longest arc-preserving common subsequence problem

Author: Blesa Aguilera Maria Josep
Blum Christian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

The longest arc-preserving common subsequence problem is an NP-hard combinatorial optimization problem from the field of computational biology. This problem finds applications, in particular, in the comparison of arc-annotated ribonucleic acid (RNA) sequences. In this work we propose a simple, hybrid evolutionary algorithm to tackle this problem. The most important feature of this algorithm concerns a crossover operator based on solution merging. In solution merging, two or more solutions to the problem are merged, and an exact technique is used to find the best solution within this union. It is experimentally shown that the proposed algorithm outperforms a heuristic from the literature. Peer Reviewed. Postprint (author's final draft)

arXiv.org e-Print Archive

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

What Makes the Arc-Preserving Subsequence Problem Hard?

Author: Blin Guillaume
Fertin Guillaume
Rizzi Romeo
Vialette Stéphane
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2005
Field of study

International audience. Given two arc-annotated sequences (S, P) and (T, Q) representing RNA structures, the Arc-Preserving Subsequence (APS) problem asks whether (T, Q) can be obtained from (S, P) by deleting some of its bases (together with their incident arcs, if any). In previous studies [3, 6], this problem has been naturally divided into subproblems reflecting the intrinsic complexity of arc structures. We show that APS(Crossing, Plain) is NP-complete, thereby answering an open problem [6]. Furthermore, to get more insight into where the actual border of APS hardness lies, we refine the classical APS subproblems in much the same way as in [11] and give a complete categorization among various restrictions of APS problem complexity.
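The abstract above refers to subproblems such as APS(Crossing, Plain) without spelling out the levels of arc structure. The sketch below assumes the hierarchy commonly used in the arc-annotated sequence literature (Plain, Chain, Nested, Crossing, Unlimited); these definitions are not stated in the abstract itself, and the function name and the 0-based pair representation of arcs are hypothetical choices for illustration.

```python
def arc_structure_level(arcs):
    """Return the most restrictive level that a set of arcs satisfies.

    Assumed definitions (not taken from the abstract):
      Plain     -- no arcs at all
      Chain     -- no shared endpoints, no crossing arcs, no nested arcs
      Nested    -- no shared endpoints, no crossing arcs
      Crossing  -- no shared endpoints
      Unlimited -- no restriction
    arcs is an iterable of (i, j) position pairs.
    """
    arcs = sorted({tuple(sorted(a)) for a in arcs})
    if not arcs:
        return "Plain"
    # A base incident to more than one arc rules out every restricted level.
    endpoints = [e for a in arcs for e in a]
    if len(endpoints) != len(set(endpoints)):
        return "Unlimited"
    # Arcs are sorted, so in each pair (a, b) arc a starts before arc b.
    pairs = [(a, b) for i, a in enumerate(arcs) for b in arcs[i + 1:]]
    # b starts inside a but ends outside it: the two arcs cross.
    if any(b[0] < a[1] < b[1] for a, b in pairs):
        return "Crossing"
    # b lies entirely inside a: the two arcs are nested.
    if any(b[1] < a[1] for a, b in pairs):
        return "Nested"
    return "Chain"


print(arc_structure_level([(0, 3), (1, 2)]))
print(arc_structure_level([(0, 2), (1, 3)]))
```

Under these assumed definitions, APS(Crossing, Plain) is the case where the longer sequence may carry crossing arcs while the sequence to be embedded has no arcs at all.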

Lightweight comparison of RNAs based on exact sequence–structure matches

Author: Allali
Altschul
Backofen
Bafna
Bahr
Bauer
Blin
Cannone
Evans
Gardner
Griffiths-Jones
Havgaard
Hentze
Hofacker
Hofacker
Huttenhofer
Höchsmann
Jiang
Jiang
Lin
Martineau
Mathews
Mathews
Michael Beckstette
Otto
Rolf Backofen
Sankoff
Sebastian Will
Serganov
Steffen Heyne
Torarinsson
Will
Wilm
Wilting
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2009
Field of study

Motivation: Specific functions of ribonucleic acid (RNA) molecules are often associated with different motifs in the RNA structure. The key feature that forms such an RNA motif is the combination of sequence and structure properties. In this article, we introduce a new RNA sequence–structure comparison method which maintains exact matching substructures. Existing common substructures are treated as whole units while variability is allowed between such structural motifs.

CiteSeerX

Crossref

PubMed Central

Publications at Bielefeld University

A list of parameterized problems in bioinformatics

Author: Félix Ávila Liliana
García Chacón Alina
Serna Iglesias María José
Thilikos Touloupas Dimitrios
Publication venue
Publication date: 01/01/2006
Field of study

In this report we present a list of problems that originated in bioinformatics. Our aim is to collect information on such problems that have been analyzed from the point of view of Parameterized Complexity. For every problem we give its definition and biological motivation together with known complexity results. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC