How to compare arc-annotated sequences: The alignment hierarchy
We describe a new unifying framework to express comparison of arc-annotated sequences, which we call alignment of arc-annotated sequences. We first prove that this framework encompasses the main existing models, which allows us to deduce complexity results for several cases from the literature. We also show that this framework gives rise to new relevant problems that have not been studied yet. We provide a thorough analysis of these novel cases by proposing two polynomial time algorithms and an NP-completeness proof. This leads to an almost exhaustive study of alignment of arc-annotated sequences.
A Faster Algorithm for Finding Minimum Tucker Submatrices.
A binary matrix has the Consecutive Ones Property (C1P) if its columns can be ordered in such a way that all 1s on each row are consecutive. Algorithmic issues of the C1P are central in computational molecular biology, in particular for physical mapping and ancestral genome reconstruction. In 1972, Tucker gave a characterization of matrices that have the C1P by a set of forbidden submatrices, and a substantial amount of research has been devoted to the problem of efficiently finding such a minimum size forbidden submatrix. This paper presents a new O(Δ^3 m^2 (mΔ + n^3)) time algorithm for this particular task for an m × n binary matrix with at most Δ 1-entries per row, thereby improving the O(Δ^3 m^2 (mn + n^3)) time algorithm of Dom et al. [17].
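The C1P definition in the abstract above can be illustrated with a minimal brute-force sketch. This is not the algorithm of the paper or of Dom et al.; it simply tries every column order, so it is only usable on tiny matrices. The function names `row_ones_consecutive` and `has_c1p_bruteforce` are hypothetical and chosen for illustration.

```python
from itertools import permutations


def row_ones_consecutive(row):
    """Return True if all 1s in the row form a single contiguous block."""
    ones = [i for i, v in enumerate(row) if v == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def has_c1p_bruteforce(matrix):
    """Brute-force C1P test (illustrative assumption, exponential time).

    Tries every ordering of the columns and checks whether the 1s of
    each row become consecutive under that ordering.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    for order in permutations(range(n_cols)):
        if all(row_ones_consecutive([row[j] for j in order]) for row in matrix):
            return True
    return False


# Column order (0, 2, 1) makes the 1s of both rows consecutive.
print(has_c1p_bruteforce([[1, 0, 1], [0, 1, 1]]))
# Every pair of the three columns must be adjacent, which is impossible.
print(has_c1p_bruteforce([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))
```

The second matrix is a small example of a matrix without the C1P: each row forces a different pair of columns to be adjacent, and three columns in a line cannot satisfy all three constraints.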
Common Structured Patterns in Linear Graphs: Approximation and Combinatorics
A linear graph is a graph whose vertices are linearly ordered. This linear ordering allows pairs of disjoint edges to be either preceding (<), nesting (⊏) or crossing (≬). Given a family of linear graphs, and a non-empty subset R ⊆ {<, ⊏, ≬}, we are interested in the Maximum Common Structured Pattern (MCSP) problem: find a maximum size edge-disjoint graph, with edge-pairs all comparable by one of the relations in R, that occurs as a subgraph in each of the linear graphs of the family. The MCSP problem generalizes many structure-comparison and structure-prediction problems that arise in computational molecular biology. We give tight hardness results for the MCSP problem for {<, ≬}-structured patterns and {⊏, ≬}-structured patterns. Furthermore, we prove that the problem is approximable within ratios: (i) 2H(k) for {<, ≬}-structured patterns, (ii) √k for {⊏, ≬}-structured patterns, and (iii) O(√(k log k)) for {<, ⊏, ≬}-structured patterns, where k is the size of the optimal solution and H(k) is the k-th harmonic number. Also, we provide combinatorial results concerning the different types of structured patterns that are of independent interest in their own right.
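The three relations between disjoint edges of a linear graph (precedence, nesting, crossing) can be made concrete with a short sketch. It assumes vertices are represented by integer positions and edges by pairs of positions; the names `relation` and `is_structured` and the string labels are hypothetical, introduced only for illustration.

```python
from itertools import combinations


def relation(e, f):
    """Classify two vertex-disjoint edges of a linear graph.

    Each edge is a pair of distinct vertex positions. Returns "<" for
    precedence, "nest" for nesting, or "cross" for crossing.
    """
    (a, b), (c, d) = sorted([tuple(sorted(e)), tuple(sorted(f))])
    if len({a, b, c, d}) != 4:
        raise ValueError("edges must be vertex-disjoint")
    # After sorting, a < b, c < d and a < c.
    if b < c:
        return "<"      # a < b < c < d
    if d < b:
        return "nest"   # a < c < d < b
    return "cross"      # a < c < b < d


def is_structured(edges, allowed):
    """True if every pair of edges is related by a relation in `allowed`."""
    return all(relation(e, f) in allowed for e, f in combinations(edges, 2))


print(relation((1, 2), (3, 4)))
print(relation((1, 4), (2, 3)))
print(relation((1, 3), (2, 4)))
print(is_structured([(1, 6), (2, 5), (3, 4)], {"nest"}))
```

With a check like `is_structured`, a candidate pattern for a model R is simply a set of pairwise disjoint edges for which the check succeeds with `allowed` set to the relations in R.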
New results for the 2-interval pattern problem
We present new results concerning the problem of finding a constrained pattern in a set of 2-intervals. Given a set of n 2-intervals D and a model R describing if two disjoint 2-intervals can be in precedence order (<), be allowed to nest (⊏) and/or be allowed to cross (≬), the problem asks to find a maximum cardinality subset D′ ⊆ D such that any two 2-intervals in D′ agree with R. We improve the time complexity of the best known algorithm for R = {⊏} by giving an optimal O(n log n) time algorithm. Also, we give a graph-like relaxation for R
Approximating the 2-Interval Pattern problem
We address the problem of approximating the 2-Interval Pattern problem over its various models and restrictions. This problem, which is motivated by RNA secondary structure prediction, asks to find a maximum cardinality subset of a 2-interval set with respect to some prespecified model. For each such model, we give approximation guarantees that vary with the restrictions imposed on the input 2-interval set.