Search CORE

11 research outputs found

Genomes containing Duplicates are Hard to compare

Author: Chauve Cedric
Fertin Guillaume
Rizzi Romeo
Vialette Stéphane
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G1 and G2: the first model, that we will call the matching model, consists in making a one-to-one correspondence between genes of G1 and genes of G2, in such a way that M is optimized. The second model, called the exemplar model, consists in keeping in G1 (resp. G2) exactly one copy of each gene, thus deleting all the other copies, in such a way that M is optimized. We present here different results concerning the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, basically showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates. We show indeed that for common intervals, MAD and SAD, the problem is NP-complete when genes are duplicated in genomes, in both the exemplar and matching models. In the case of MAD and SAD, we actually prove that, under both models, both MAD and SAD problems are APX-hard.
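The exemplar model described in this abstract can be illustrated with a small brute-force sketch. It is not the authors' method: it assumes unsigned genes, assumes both genomes contain the same gene families, and counts every gene set that is contiguous in both exemplarized genomes (singletons and the full set included) as a common interval. The function names are hypothetical. The enumeration is exponential in the number of duplicated genes, which is consistent with the hardness results stated above.

```python
from itertools import product


def common_intervals(p, q):
    """Count gene sets that are contiguous in both duplicate-free genomes p and q."""
    pos = {gene: i for i, gene in enumerate(q)}
    count = 0
    for i in range(len(p)):
        lo = hi = pos[p[i]]
        for j in range(i, len(p)):
            # Track the span in q of the genes p[i..j].
            lo = min(lo, pos[p[j]])
            hi = max(hi, pos[p[j]])
            if hi - lo == j - i:  # the same genes form an interval of q
                count += 1
    return count


def exemplarizations(genome):
    """Yield every genome obtained by keeping exactly one copy of each gene."""
    occurrences = {}
    for i, gene in enumerate(genome):
        occurrences.setdefault(gene, []).append(i)
    for choice in product(*occurrences.values()):
        yield [genome[i] for i in sorted(choice)]


def exemplar_common_intervals(g1, g2):
    """Maximum number of common intervals over all pairs of exemplarizations."""
    return max(
        common_intervals(a, b)
        for a in exemplarizations(g1)
        for b in exemplarizations(g2)
    )
```

For instance, with G1 = a b c d and G2 = a c b c d, keeping the second copy of c in G2 yields two identical genomes, so every one of the 10 intervals is common.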

INRIA a CCSD electronic archive server

On the Approximability of Comparing Genomes with Duplicates

Author: Angibaud Sébastien
Fertin Guillaume
Rusu Irena
Thévenin Annelyse
Vialette Stéphane
Publication venue: Brown University
Publication date: 01/01/2009

A central problem in comparative genomics consists in computing a (dis-)similarity measure between two genomes, e.g. in order to construct a phylogenetic tree. A large number of such measures has been proposed in the recent past: number of reversals, number of breakpoints, number of common or conserved intervals etc. In their initial definitions, all these measures suppose that genomes contain no duplicates. However, we now know that genes can be duplicated within the same genome. One possible approach to overcome this difficulty is to establish a one-to-one correspondence (i.e. a matching) between genes of both genomes, where the correspondence is chosen in order to optimize the studied measure. Then, after a gene relabeling according to this matching and a deletion of the unmatched signed genes, two genomes without duplicates are obtained and the measure can be computed. In this paper, we are interested in three measures (number of breakpoints, number of common intervals and number of conserved intervals) and three models of matching (exemplar, intermediate and maximum matching models). We prove that, for each model and each measure M, computing a matching between two genomes that optimizes M is APX-hard. We show that this result remains true even for two genomes G1 and G2 such that G1 contains no duplicates and no gene of G2 appears more than twice. Therefore, our results extend those of [7, 10, 13]. Besides, in order to evaluate the possible existence of approximation algorithms concerning the number of breakpoints, we also study the complexity of the following decision problem: is there an exemplarization (resp. an intermediate matching, a maximum matching) that induces no breakpoint? In particular, we extend a result of [13] by proving the problem to be NP-complete in the exemplar model for a new class of instances, we note that the problems are equivalent in the intermediate and the exemplar models and we show that the problem is in P in the maximum matching model. Finally, we focus on a fourth measure, closely related to the number of breakpoints: the number of adjacencies, for which we give several constant ratio approximation algorithms in the maximum matching model, in the case where genomes contain the same number of duplications of each gene.

On the Approximability of Comparing Genomes with Duplicates

Author: Angibaud Sébastien
Fertin Guillaume
Rusu Irena
Thévenin Annelyse
Vialette Stéphane
Publication date: 01/01/2008

A central problem in comparative genomics consists in computing a (dis-)similarity measure between two genomes, e.g. in order to construct a phylogeny. All the existing measures are defined on genomes without duplicates. However, we know that genes can be duplicated within the same genome. One possible approach to overcome this difficulty is to establish a one-to-one correspondence (i.e. a matching) between genes of both genomes, where the correspondence is chosen in order to optimize the studied measure. In this paper, we are interested in three measures (number of breakpoints, number of common intervals and number of conserved intervals) and three models of matching (exemplar, intermediate and maximum matching models). We prove that, for each model and each measure M, computing a matching between two genomes that optimizes M is APX-hard. We also study the complexity of the following problem: is there an exemplarization (resp. an intermediate/maximum matching) that induces no breakpoint? We prove the problem to be NP-complete in the exemplar model for a new class of instances, and we show that the problem is in P in the maximum matching model. We also focus on a fourth measure: the number of adjacencies, for which we give several approximation algorithms in the maximum matching model, in the case where genomes contain the same number of duplications of each gene.
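The zero-breakpoint question raised in this abstract can be made concrete with a brute-force sketch for the exemplar model. This is an illustration under stated assumptions, not the construction used in the paper: genes are signed integers, genomes are linear without extremity markers, an adjacency (a, b) counts as conserved when either (a, b) or (-b, -a) is consecutive in the other genome, and all function names are hypothetical. The search tries every exemplarization, so it takes exponential time in general.

```python
from itertools import product


def breakpoints(p, q):
    """Count consecutive pairs of p that are not conserved in q (signed genes)."""
    conserved = set()
    for a, b in zip(q, q[1:]):
        conserved.add((a, b))
        conserved.add((-b, -a))  # same adjacency read on the opposite strand
    return sum((a, b) not in conserved for a, b in zip(p, p[1:]))


def exemplarizations(genome):
    """Yield every genome keeping exactly one copy of each gene family."""
    occurrences = {}
    for i, gene in enumerate(genome):
        occurrences.setdefault(abs(gene), []).append(i)
    for choice in product(*occurrences.values()):
        yield [genome[i] for i in sorted(choice)]


def has_zero_breakpoint_exemplarization(g1, g2):
    """True if some pair of exemplarizations of g1 and g2 induces no breakpoint."""
    return any(
        breakpoints(a, b) == 0
        for a in exemplarizations(g1)
        for b in exemplarizations(g2)
    )
```

With G1 = 1 2 3 and G2 = 1 3 2 3, deleting the first copy of gene 3 in G2 leaves 1 2 3, so the answer is positive; with G2 = 2 1 3 1, neither exemplarization removes all breakpoints.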

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

HAL-Rennes 1

Analysis of Gene Order Evolution Beyond Single-Copy Genes

Author: A Bergeron
A Siepel
A Xu
B Arden
B Ma
B Moret
B Vernot
C Chauve
C Zheng
CM Zmasek
D Bader
D Bertrand
D Durand
D Fulkerson
D Sankoff
D Soltis
E Eichler
E Lyons
F Murat
G Blanc
G Blin
G Bourque
G Fertin
G Glusman
G Landau
G Shi
G Tesler
G Watterson
H Gavranović
I Wapinski
J Bowers
J Cotton
J Demuth
J Gordon
J Mixtacki
J Nadeau
J Salse
J-P Doyon
K Chen
K O’Brien
K Wolfe
L Zhang
M Alekseyev
M Goodman
M Hahn
M Lajoie
M Lynch
M Muffato
M Sanderson
M Shannon
N El-Mabrouk
O Elemento
O Eulenstein
O Tremblay-Savard
P Bonizzoni
P Gorecki
P Pevzner
Q Zhu
R Guigó
R Hoberman
R LaRue
R Page
R Tatusov
R Warren
S Angibaud
S Hannenhalli
S Pham
S Schwartz
S Yancopoulos
T Blomme
T Uno
T Vinař
V Bafna
V Shoja
W Fitch
W Li
WJ Kent
Z Adam
Z Fu
Z Yang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Ancestral Genome Organization: An Alignment Approach

Author: Blanchette M.
Bourque G.
David Ardell
Jiang M.
Krister Swenson
Nadia El-Mabrouk
Patrick Holloway
Pe'er I.
Withers M.
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Genomes containing duplicates are hard to compare

Author: Chauve Cedric
Fertin Guillaume
Rizzi Romeo
Vialette Stéphane
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Catalogo dei prodotti della ricerca

Hal-Diderot

HAL - UPEC / UPEM

HAL-Rennes 1

Genomes containing Duplicates are Hard to compare (Extended Abstract)

Author: Cedric Chauve
Guillaume Fertin
Romeo Rizzi
Stéphane Vialette

Abstract. In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G1 and G2: the first model, that we will call the matching model, consists in making a one-to-one correspondence between genes of G1 and genes of G2, in such a way that M is optimized. The second model, called the exemplar model, consists in keeping in G1 (resp. G2) exactly one copy of each gene, thus deleting all the other copies, in such a way that M is optimized. We present here different results concerning the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, basically showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates. We show indeed that for common intervals, MAD and SAD, the problem is NP-complete when genes are duplicated in genomes, in both the exemplar and matching models. In the case of MAD and SAD, we actually prove that, under both models, both MAD and SAD problems are APX-hard.

CiteSeerX