
31 research outputs found

Order independent structural alignment of circularly permuted proteins

Author: Binkowski T. Andrew
DasGupta Bhaskar
Liang Jie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004
Field of study

Circular permutation connects the N and C termini of a protein and concurrently cleaves elsewhere in the chain, providing an important mechanism for generating novel protein folds and functions. However, their prevalence in genomes is unknown because current detection methods can miss many occurrences, mistaking random repeats for circular permutations. Here we develop a method for detecting circularly permuted proteins from structural comparison. Sequence order independent alignment of protein structures can be regarded as a special case of the maximum-weight independent set problem, which is known to be computationally hard. We develop an efficient approximation algorithm by repeatedly solving relaxations of an appropriate intermediate integer programming formulation, and we show that the approximation ratio is much better than the theoretical worst-case ratio of $r = 1/4$. Circularly permuted proteins reported in the literature can be identified rapidly with our method, while they escape detection by publicly available servers for structural alignment.

Comment: 5 pages, 3 figures, accepted by IEEE-EMBS 2004 Conference Proceeding

arXiv.org e-Print Archive

Crossref

Inapproximability of maximal strip recovery

Author: C. Zheng
C.H. Papadimitriou
E. Hazan
I. Dinur
J. Akiyama
J. Akiyama
L. Bulteau
L. Wang
M. Chlebík
M. Jiang
M. Jiang
P. Alimonti
R. Bar-Yehuda
R.B. Lyngsø
Z. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

In comparative genomics, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks that are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given $d$ genomic maps as sequences of gene markers, the objective of MSR-$d$ is to find $d$ subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant $d \ge 2$, a polynomial-time $2d$-approximation for MSR-$d$ was previously known. In this paper, we show that for any $d \ge 2$, MSR-$d$ is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating MSR-$d$ for all $d \ge 2$. In particular, we show that MSR-$d$ is NP-hard to approximate within $\Omega(d/\log d)$. From the other direction, we show that the previous $2d$-approximation for MSR-$d$ can be optimized into a polynomial-time algorithm even if $d$ is not a constant but is part of the input. We then extend our inapproximability results to several related problems including CMSR-$d$, $\delta$-gap-MSR-$d$, and $\delta$-gap-CMSR-$d$.

Comment: A preliminary version of this paper appeared in two parts in the Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC 2009) and the Proceedings of the 4th International Frontiers of Algorithmics Workshop (FAW 2010)
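To make the MSR objective concrete, the following is a minimal Python sketch for the basic case $d = 2$ described in the abstract (distinct markers, all in positive orientation). It assumes that a syntenic block ("strip") is a maximal run of at least two markers that appear consecutively and in the same order in both maps; that reading, and the function names `total_strip_length` and `brute_force_msr2`, are illustrative assumptions and not taken from the paper. The exhaustive search is exponential and is only meant for tiny inputs.

```python
from itertools import combinations


def total_strip_length(map_a, map_b):
    """Sum the lengths of maximal common runs (length >= 2) of two marker maps.

    Assumption: both maps contain the same distinct markers, all in positive
    orientation, so a strip is a run that is consecutive in both maps.
    """
    pos_b = {marker: i for i, marker in enumerate(map_b)}
    total = 0
    run = 1  # length of the current run of markers consecutive in both maps
    for prev, cur in zip(map_a, map_a[1:]):
        if pos_b[cur] == pos_b[prev] + 1:
            run += 1
        else:
            if run >= 2:
                total += run
            run = 1
    if run >= 2:
        total += run
    return total


def brute_force_msr2(map_a, map_b):
    """Try every subset of markers to keep and return the best strip length."""
    markers = list(map_a)
    best = 0
    for k in range(len(markers) + 1):
        for keep in combinations(markers, k):
            keep_set = set(keep)
            sub_a = [m for m in map_a if m in keep_set]
            sub_b = [m for m in map_b if m in keep_set]
            best = max(best, total_strip_length(sub_a, sub_b))
    return best


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [1, 3, 2, 4]
    print(total_strip_length(a, b))  # no strips before any marker is deleted
    print(brute_force_msr2(a, b))    # deleting one noisy marker exposes a strip
```

In the small example, the two maps share no strip as given, but deleting marker 2 (or marker 3) leaves identical subsequences of three markers, which illustrates why removing noise can increase the total strip length.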

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

Linear-vertex kernel for the problem of packing r-stars into a graph without long induced paths

Author: Anders Yeo
Bar-Yehuda
Barbero
Bejar
Bin Sheng
Cygan
Downey
Fellows
Fellows
Florian Barbero
Fomin
Gregory Gutin
Kirkpatrick
Kratsch
Lokshtanov
Mark Jones
Prieto
Wang
Publication venue: 'Elsevier BV'
Publication date: 13/10/2015
Field of study

Let integers $r\ge 2$ and $d\ge 3$ be fixed. Let ${\cal G}_d$ be the set of graphs with no induced path on $d$ vertices. We study the problem of packing $k$ vertex-disjoint copies of $K_{1,r}$ ($k\ge 2$) into a graph $G$ from the point of view of parameterized preprocessing, i.e., kernelization. We show that every graph $G\in {\cal G}_d$ can be reduced, in polynomial time, to a graph $G'\in {\cal G}_d$ with $O(k)$ vertices such that $G$ has at least $k$ vertex-disjoint copies of $K_{1,r}$ if and only if $G'$ has. Such a result is known for arbitrary graphs $G$ when $r=2$, and we conjecture that it holds for every $r\ge 2$.
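As an illustration of the decision problem being kernelized (not of the kernelization itself), here is a minimal exhaustive-search sketch in Python that checks whether a graph contains $k$ vertex-disjoint copies of $K_{1,r}$ as subgraphs. The adjacency-dictionary representation and the function name `has_star_packing` are assumptions made for this example; the search is exponential and only suitable for very small graphs.

```python
from itertools import combinations


def has_star_packing(adj, r, k, used=frozenset()):
    """Return True if the graph has k vertex-disjoint copies of K_{1,r}.

    adj maps each vertex to the set of its neighbours (hypothetical format).
    A copy of K_{1,r} is a center plus r distinct unused neighbours.
    """
    if k == 0:
        return True
    for center in adj:
        if center in used:
            continue
        free_leaves = [v for v in adj[center] if v not in used]
        for leaves in combinations(free_leaves, r):
            # Reserve the center and its leaves, then pack the remaining stars.
            if has_star_packing(adj, r, k - 1, used | {center, *leaves}):
                return True
    return False


if __name__ == "__main__":
    # Path on six vertices: 0 - 1 - 2 - 3 - 4 - 5
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
    print(has_star_packing(path, r=2, k=2))
    print(has_star_packing(path, r=2, k=3))
```

A kernel in the sense of the abstract would shrink the input to $O(k)$ vertices first, after which even a search like this one runs on an instance whose size depends only on $k$.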

arXiv.org e-Print Archive

Crossref

Royal Holloway - Pure

HAL Descartes

Hal-Diderot

Core congestion is inherent in hyperbolic networks

Author: Chepoi Victor
Dragan Feodor F.
Vaxès Yann
Publication venue
Publication date: 12/07/2016
Field of study

We investigate the impact that negative curvature has on traffic congestion in large-scale networks. We prove that every Gromov hyperbolic network $G$ admits a core, thus answering in the positive a conjecture by Jonckheere, Lou, Bonahon, and Baryshnikov, Internet Mathematics, 7 (2011), which is based on the experimental observation by Narayan and Saniee, Physical Review E, 84 (2011), that real-world networks with small hyperbolicity have a core congestion. Namely, we prove that for every subset $X$ of vertices of a $\delta$-hyperbolic graph $G$ there exists a vertex $m$ of $G$ such that the disk $D(m,4\delta)$ of radius $4\delta$ centered at $m$ intercepts at least one half of the total flow between all pairs of vertices of $X$, where the flow between two vertices $x,y\in X$ is carried by geodesic (or quasi-geodesic) $(x,y)$-paths. A set $S$ intercepts the flow between two nodes $x$ and $y$ if $S$ intersects every shortest path between $x$ and $y$. Differently from what was conjectured by Jonckheere et al., we show that $m$ is not (and cannot be) the center of mass of $X$ but is a node close to the median of $X$ in the so-called injective hull of $X$. In the case of non-uniform traffic between nodes of $X$ (in this case, the unit flow exists only between certain pairs of nodes of $X$ defined by a commodity graph $R$), we prove a primal-dual result showing that for any $\rho>5\delta$ the size of a $\rho$-multi-core (i.e., the number of disks of radius $\rho$) intercepting all pairs of $R$ is upper bounded by the maximum number of pairwise $(\rho-3\delta)$-apart pairs of $R$.
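The interception condition in the abstract can be checked directly on a small unweighted graph. The sketch below is an illustrative Python example, not the authors' method: it assumes a connected graph given as an adjacency dictionary, uniform unit flow between unordered pairs of $X$, and the helper names `bfs_dist`, `disk`, `intercepts`, and `intercepted_fraction` are all hypothetical. A set $S$ meets every shortest $(x,y)$-path exactly when an endpoint lies in $S$ or removing $S$ makes the distance from $x$ to $y$ strictly larger (or infinite).

```python
from collections import deque
from itertools import combinations


def bfs_dist(adj, source, blocked=frozenset()):
    """Breadth-first distances from source, never entering blocked vertices."""
    if source in blocked:
        return {}
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist and v not in blocked:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def disk(adj, center, radius):
    """Vertices at distance at most radius from center."""
    return {v for v, d in bfs_dist(adj, center).items() if d <= radius}


def intercepts(adj, s, x, y):
    """True if s meets every shortest path between x and y (graph assumed connected)."""
    if x in s or y in s:
        return True
    full = bfs_dist(adj, x)[y]
    avoiding = bfs_dist(adj, x, blocked=s).get(y)
    return avoiding is None or avoiding > full


def intercepted_fraction(adj, x_set, center, radius):
    """Fraction of unordered pairs of x_set whose flow the disk intercepts."""
    s = disk(adj, center, radius)
    pairs = list(combinations(sorted(x_set), 2))
    hit = sum(intercepts(adj, s, x, y) for x, y in pairs)
    return hit / len(pairs)


if __name__ == "__main__":
    # Path 0 - 1 - 2 - 3 - 4 with the candidate core vertex m = 2.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    print(intercepted_fraction(path, {0, 1, 3, 4}, center=2, radius=0))
```

On the path example, the disk of radius 0 around the middle vertex intercepts four of the six pairs, since only the pairs (0,1) and (3,4) have a shortest path avoiding vertex 2.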

arXiv.org e-Print Archive

Crossref

HAL AMU

On tree-constrained matchings and generalizations

Author: Canzar S. (Stefan)
Elbassioni K.
Klau G.W. (Gunnar)
Mestre J.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2013
Field of study

CWI's Institutional Repository

A Graph-Theoretic Barcode Ordering Model for Linked-Reads

Author: Chauve Cedric
Chikhi Rayan
Dufresne Yoann
Lavenier Dominique
Marijon Pierre
Sun Chen
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 20th International Workshop on Algorithms in Bioinformatics (WABI 2020)
Publication date: 01/01/2020
Field of study

Considering a set of intervals on the real line, an interval graph records these intervals as nodes and their intersections as edges. Identifying (i.e. merging) pairs of nodes in an interval graph results in a multiple-interval graph. Given only the nodes and the edges of the multiple-interval graph without knowing the underlying intervals, we are interested in the following questions. Can one determine how many intervals correspond to each node? Can one compute a walk over the multiple-interval graph nodes that reflects the ordering of the original intervals? These questions are closely related to linked-read DNA sequencing, where barcodes are assigned to long molecules whose intersection graph forms an interval graph. Each barcode may correspond to multiple molecules, which complicates downstream analysis, and corresponds to the identification of nodes of the corresponding interval graph. Resolving the above graph-theoretic problems would facilitate analyses of linked-read sequencing data, by enabling the conceptual separation of barcodes into molecules and by providing, through the order of the molecules, a skeleton for accurately assembling the genome. Here, we propose a framework that takes as input an arbitrary intersection graph (such as an overlap graph of barcodes) and constructs a heuristic approximation of the ordering of the original intervals.
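The forward direction of this model (from intervals to a multiple-interval graph) is easy to sketch, even though the paper addresses the harder inverse problem. The Python example below is a minimal illustration under stated assumptions: intervals are closed, molecules and barcodes are represented by hypothetical string labels, and the function names `interval_graph` and `identify_nodes` are invented for this example.

```python
from itertools import combinations


def interval_graph(intervals):
    """Build the edge set of the interval graph.

    intervals maps a node name to a closed interval (start, end).
    Two nodes are adjacent when their intervals intersect.
    """
    edges = set()
    for (a, (s1, e1)), (b, (s2, e2)) in combinations(intervals.items(), 2):
        if s1 <= e2 and s2 <= e1:
            edges.add(frozenset((a, b)))
    return edges


def identify_nodes(edges, label):
    """Merge nodes that share a label (e.g. molecules sharing a barcode)."""
    merged = set()
    for edge in edges:
        u, v = tuple(edge)
        if label[u] != label[v]:  # drop self-loops created by merging
            merged.add(frozenset((label[u], label[v])))
    return merged


if __name__ == "__main__":
    molecules = {"m1": (0, 2), "m2": (1, 4), "m3": (5, 7), "m4": (6, 9)}
    barcode = {"m1": "A", "m2": "B", "m3": "A", "m4": "C"}
    graph = interval_graph(molecules)
    print(sorted(sorted(e) for e in identify_nodes(graph, barcode)))
```

In this example barcode A covers two molecules that are far apart on the line, so the merged graph connects A to both B and C although the underlying intervals of B and C are nowhere near each other. Recovering that structure from the merged graph alone is the question the abstract raises.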

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

HAL-Pasteur

Hal-Diderot

HAL-Rennes 1

On Tree-Constrained Matchings and Generalizations

Author: Canzar S. (Stefan)
Elbassioni K.
Klau G.W. (Gunnar)
Mestre J.
Publication venue: CWI
Publication date: 01/01/2011
Field of study

We consider the following Tree-Constrained Bipartite Matching problem: Given two rooted trees $T_1=(V_1,E_1)$, $T_2=(V_2,E_2)$ and a weight function $w: V_1\times V_2 \mapsto \mathbb{R}_+$, find a maximum weight matching $\mathcal{M}$ between nodes of the two trees, such that none of the matched nodes is an ancestor of another matched node in either of the trees. This generalization of the classical bipartite matching problem appears, for example, in the computational analysis of live cell video data. We show that the problem is $\mathcal{APX}$-hard and thus, unless $\mathcal{P} = \mathcal{NP}$, disprove a previous claim that it is solvable in polynomial time. Furthermore, we give a $2$-approximation algorithm based on a combination of the local ratio technique and a careful use of the structure of basic feasible solutions of a natural LP-relaxation, which we also show to have an integrality gap of $2-o(1)$. In the second part of the paper, we consider a natural generalization of the problem, where trees are replaced by partially ordered sets (posets). We show that the local ratio technique gives a $2k\rho$-approximation for the $k$-dimensional matching generalization of the problem, in which the maximum number of incomparable elements below (or above) any given element in each poset is bounded by $\rho$. We finally give an almost matching integrality gap example, and an inapproximability result showing that the dependence on $\rho$ is most likely unavoidable.
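The feasibility constraint of the problem can be stated in a few lines of code. The following Python sketch is an illustration only, not the 2-approximation from the paper: it assumes each tree is given as a parent-pointer dictionary (root mapped to `None`), weights are given for a sparse set of candidate pairs, and the names `ancestors`, `is_feasible`, and `brute_force_best` are hypothetical. The exhaustive search is exponential and intended for tiny instances.

```python
from itertools import combinations


def ancestors(parent, node):
    """Return the set of proper ancestors of node, following parent pointers."""
    result = set()
    while parent.get(node) is not None:
        node = parent[node]
        result.add(node)
    return result


def is_feasible(parent1, parent2, matching):
    """Check that matching is a matching and respects the ancestor constraint."""
    for parent, side in ((parent1, 0), (parent2, 1)):
        matched = [pair[side] for pair in matching]
        if len(set(matched)) != len(matched):
            return False  # some node is matched twice
        matched_set = set(matched)
        if any(ancestors(parent, n) & matched_set for n in matched):
            return False  # a matched node is an ancestor of another matched node
    return True


def brute_force_best(parent1, parent2, weight):
    """Enumerate all subsets of candidate pairs and keep the heaviest feasible one."""
    pairs = list(weight)
    best, best_matching = 0, ()
    for k in range(1, len(pairs) + 1):
        for cand in combinations(pairs, k):
            if is_feasible(parent1, parent2, cand):
                total = sum(weight[p] for p in cand)
                if total > best:
                    best, best_matching = total, cand
    return best, best_matching


if __name__ == "__main__":
    t1 = {"a": None, "b": "a", "c": "a"}  # root a with children b and c
    t2 = {"x": None, "y": "x", "z": "x"}  # root x with children y and z
    w = {("a", "x"): 5, ("b", "y"): 3, ("c", "z"): 3, ("b", "x"): 1}
    print(brute_force_best(t1, t2, w))
```

In the example, matching the two roots alone has weight 5, but it blocks every descendant; matching the two pairs of leaves is feasible and heavier, which shows how the ancestor constraint changes the optimum compared with ordinary bipartite matching.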

CWI's Institutional Repository

Recognizing Unit Multiple Intervals Is Hard

Author: Ardevol Martinez Virginia
Florian Sikora
Romeo Rizzi
Stephane Vialette
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum fur Informatik
Publication date: 01/01/2023
Field of study

Multiple interval graphs are a well-known generalization of interval graphs introduced in the 1970s to deal with situations arising naturally in scheduling and allocation. A d-interval is the union of d intervals on the real line, and a graph is a d-interval graph if it is the intersection graph of d-intervals. In particular, it is a unit d-interval graph if it admits a d-interval representation where every interval has unit length. Whereas it has been known for a long time that recognizing 2-interval graphs and other related classes such as 2-track interval graphs is NP-complete, the complexity of recognizing unit 2-interval graphs remains open. Here, we settle this question by proving that the recognition of unit 2-interval graphs is also NP-complete. Our proof technique uses a completely different approach from the other hardness results for recognizing related classes. Furthermore, we extend the result to unit d-interval graphs for any d ⩾ 2, which does not follow directly in graph recognition problems; as an example, it took almost 20 years to close the gap between d = 2 and d > 2 for the recognition of d-track interval graphs. Our result has several implications, including that recognizing (x, …, x) d-interval graphs and depth r unit 2-interval graphs is NP-complete for every x ⩾ 11 and every r ⩾ 4.

Catalogo dei prodotti della ricerca