
Archivio della Ricerca - Università di Roma 3

Graph Drawing E-print Archive

Gene Order Phylogeny of the Genus Prochlorococcus

Author: AC Martiny
B Snel
BME Moret
E Bapteste
E Belda
FD Ciccarelli
G Rocap
GC Kettler
HA Bouman
Haiwei Luo
J Felsenstein
JD Palmer
JH Degnan
Jian Shi
Jijun Tang
K Tamura
Konrad Scheffler
L-S Wang
LA Raubeson
LR Moore
M Blanchette
R Aziz
R Desper
RG Olmstead
Robert Friedman
SF Altschul
SR Gadagkar
William Arndt
ZI Johnson
Publication venue: Public Library of Science
Publication date: 03/12/2008

Using gene order as a phylogenetic character has the potential to resolve previously unresolved species relationships. This character was used to resolve the evolutionary history within the genus Prochlorococcus, a group of marine cyanobacteria. Orthologous gene sets and their genomic positions were identified from 12 species of Prochlorococcus and 1 outgroup species of Synechococcus. From these data, inversion and breakpoint distance-based phylogenetic trees were computed by GRAPPA and FastME. Statistical support for the resulting topology was obtained by applying a 50% jackknife resampling technique. The result was consistent and congruent with nucleotide sequence-based and gene content-based trees. In addition, a previously unresolved clade, that of MIT9211 and SS120, was resolved. This is the first study to use gene order data to resolve a bacterial phylogeny at the genus level. It suggests that the technique is useful in resolving the Tree of Life.
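
To illustrate the breakpoint distance mentioned in the abstract above, here is a minimal Python sketch that counts the adjacencies of one signed gene order that are missing from another. It is not the GRAPPA or FastME implementation. The function name `breakpoint_distance`, the linear (non-circular) genome model without end caps, and the toy gene orders are assumptions made for illustration only.

```python
def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that do not occur in g2.

    Genomes are lists of signed integers (sign = gene orientation),
    treated as linear chromosomes over the same set of genes.
    """
    if sorted(map(abs, g1)) != sorted(map(abs, g2)):
        raise ValueError("both genomes must contain the same genes")

    # An adjacency (a, b) read left to right is the same as (-b, -a)
    # read on the opposite strand, so both forms are stored.
    adjacencies_g2 = set()
    for a, b in zip(g2, g2[1:]):
        adjacencies_g2.add((a, b))
        adjacencies_g2.add((-b, -a))

    return sum((a, b) not in adjacencies_g2 for a, b in zip(g1, g1[1:]))


# Hypothetical toy data: g2 is g1 with the segment (2, 3) inverted.
g1 = [1, 2, 3, 4, 5]
g2 = [1, -3, -2, 4, 5]
distance = breakpoint_distance(g1, g2)
```

A single inversion breaks at most two adjacencies, which is why breakpoint counts are commonly used as a cheap proxy for inversion distance when building distance matrices for tree reconstruction.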

Public Library of Science (PLOS)

Rec-DCM-Eigen: Reconstructing a Less Parsimonious but More Accurate Tree in Shorter Time

Author: A Bhutkar
A Coghlan
A Pothen
A Wei Xu
B Mohar
BME Moret
CA Stewart
Christian Schönbach
D Sankoff
DA Bader
David A. Bader
DH Huson
G Bourque
G Fertin
G Li
J Bergsten
J Tang
JA Hartigan
Jijun Tang
K Atteson
KM Swenson
M Bernt
M Blanchette
MD Hendy
MEJ Newman
N Saitou
ND Pattengale
Seunghwa Kang
Stephen W. Schaeffer
U von Luxburg
UW Roshan
W Arndt
WM Fitch
Y Lin
Publication venue: Public Library of Science

Maximum parsimony (MP) methods aim to reconstruct the phylogeny of extant species by finding the most parsimonious evolutionary scenario using the species' genome data. MP methods are considered to be accurate, but they are also computationally expensive, especially for a large number of species. Several disk-covering methods (DCMs), which decompose the input species into multiple overlapping subgroups (or disks), have been proposed to solve the problem in a divide-and-conquer way.

Running Experiments with Confidence and Sanity

Author: BME Moret
C Boettiger
C McGeoch
CC McGeoch
CS Collberg
GM Kurtzer
P Buneman
PJ Guo
R Rampin
T Bartz-Beielstein
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Analyzing data from large experimental suites is a daily task for anyone doing experimental algorithmics. In this paper we report on several approaches we tried for this seemingly mundane task in a similarity search setting, reflecting on the challenges it poses. We conclude by proposing a workflow, which can be implemented using several tools, that allows one to analyze experimental data with confidence. The extended version of this paper and the support code are provided at https://github.com/Cecca/running-experiments

The IT University of Copenhagen's Repository

Archivio istituzionale della ricerca - Università di Padova

Locating a Tree in a Phylogenetic Network in Quadratic Time

Author: BME Moret
G Cardona
IA Kanj
JM Chan
K McBreen
L van Iersel
L Nakhleh
L Parida
L Wang
P Jenkins
T Dagan
T Marcussen
TJ Treangen
Publication venue: Springer Science and Business Media LLC
Publication date: 11/02/2015

A fundamental problem in the study of phylogenetic networks is to determine whether or not a given phylogenetic network contains a given phylogenetic tree. We develop a quadratic-time algorithm for this problem for binary nearly-stable phylogenetic networks. We also show that the number of reticulations in a reticulation-visible or nearly-stable phylogenetic network is bounded from above by a function linear in the number of taxa.

arXiv.org e-Print Archive

HAL-Ecole des Ponts ParisTech

Hal-Diderot

HAL - UPEC / UPEM

The approximability of the String Barcoding problem

Author: B DasGupta
BME Moret
Giuseppe Lancia
J Borneman
KMJ De Bontridder
MR Garey
P Berman
RG Downey
RM Karp
Romeo Rizzi
S Rash
TH Cormen
U Feige
Publication venue: BioMed Central
Publication date: 01/01/2006

The String Barcoding (SBC) problem, introduced by Rash and Gusfield (RECOMB, 2002), consists in finding a minimum set of substrings that can be used to distinguish between all members of a set of given strings. In a computational biology context, the given strings represent a set of known viruses, while the substrings can be used as probes for a hybridization experiment via microarray. Ultimately, the aim is to classify new strings (unknown viruses) from the result of the hybridization experiment. In this paper we show that SBC is as hard to approximate as Set Cover. Furthermore, we show that the constrained version of SBC (with probes of bounded length) is also hard to approximate. These negative results are tight.
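
The feasibility condition behind String Barcoding can be sketched in a few lines of Python: a set of probes is a valid barcode when every input string has a distinct presence/absence pattern over the probes. This sketch only checks a candidate probe set; it does not search for a minimum one, which the abstract shows is hard to approximate. The function names and the toy sequences are hypothetical.

```python
def signature(string, probes):
    """Presence/absence pattern of each probe (substring) in the string."""
    return tuple(probe in string for probe in probes)


def distinguishes_all(strings, probes):
    """True if no two strings share the same signature over the probes."""
    signatures = [signature(s, probes) for s in strings]
    return len(set(signatures)) == len(signatures)


# Hypothetical toy "viruses" and candidate probe sets.
strings = ["acgt", "acga", "ttga"]
ok = distinguishes_all(strings, ["gt", "ac"])   # patterns differ for all three
too_few = distinguishes_all(strings, ["ac"])    # "acgt" and "acga" collide
```

With k probes there are at most 2^k distinct signatures, so at least ceil(log2(n)) probes are needed to separate n strings.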

Archivio istituzionale della ricerca - Università degli Studi di Udine

Catalogo dei prodotti della ricerca

Minimizing recombinations in consensus networks for phylogeographic studies

Author: Asif Javed
BME Moret
C Semple
D Gusfield
DH Huson
EO Wilson
Francesc Calafell
J Hein
Jaume Bertranpetit
L Parida
Laxmi Parida
MA Jobling
Marta Melé
S Arora
TH Cormen
V Vazirani
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: We address the problem of studying recombinational variations in (human) populations. In this paper, our focus is on one computational aspect of the general task: given two networks G1 and G2, with both mutation and recombination events, defined on overlapping sets of extant units, the objective is to compute a consensus network G3 with a minimum number of additional recombinations. We describe a polynomial-time algorithm with a guarantee that the number of computed new recombination events is within ϵ = sz(G1, G2) (where sz is a well-behaved function of the sizes and topologies of G1 and G2) of the optimal number of recombinations. To date, this is the best known result for a network consensus problem. Results: Although the network consensus problem can be applied to a variety of domains, here we focus on the structure of human populations. With our preliminary analysis on a segment of the human Chromosome X data we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. These results have been verified independently using traditional manual procedures. To the best of our knowledge, this is the first recombinations-based characterization of human populations. Conclusion: We show that our mathematical model identifies recombination spots in the individual haplotypes; the aggregate of these spots over a set of haplotypes defines a recombinational landscape that has enough signal to detect continental as well as population divide based on a short segment of Chromosome X. In particular, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. The agreement with mutation-based analysis can be viewed as an indirect validation of our results and the model. Since the model in principle gives us more information embedded in the networks, in our future work we plan to investigate more non-traditional questions via these structures computed by our methodology.

arXiv.org e-Print Archive

UPF Digital Repository

ScholarlyCommons@Penn

A Note on Encodings of Phylogenetic Networks of Bounded Level

Author: A Batbedat
AWM Dress
BME Moret
BR Holland
C Choy
C Semple
D Bryant
DH Huson
F Rosselló
G Cardona
HJ Bandelt
HL Chan
IA Kanj
J Jansson
JP Barthélémy
Katharina T. Huber
L van Iersel
M Arenas
M Fellows
Philippe Gambette
S Grünewald
SJ Willson
V Moulton
YS Song
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Driven by the need for better models that shed light on the question of how life's diversity has evolved, phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i.e. uniquely describe) the network that induces it? In this note, we present a complete answer to this question for the special case of a level-1 (phylogenetic) network by characterizing those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. Given that this type of network forms the first layer of the rich hierarchy of level-k networks, k a non-negative integer, it is natural to wonder whether our arguments could be extended to members of that hierarchy for higher values of k. By giving examples, we show that this is not the case.
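
As a small illustration of what it means for a structure to induce a collection of clusters, the sketch below computes the clusters of a rooted tree, the simplest case of the networks discussed above. The nested-tuple tree representation and the function name `clusters` are assumptions for illustration; the sketch does not handle networks with reticulations.

```python
def clusters(tree):
    """Return the clusters of a rooted tree as a set of frozensets.

    A tree is either a leaf label (str) or a tuple of subtrees.
    The cluster of a vertex is the set of leaf labels below it.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            below = frozenset([node])
        else:
            below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    leaves_below(tree)
    return found


# Hypothetical three-taxon tree ((a, b), c).
tree = (("a", "b"), "c")
tree_clusters = clusters(tree)
```

A rooted tree is uniquely determined by its clusters; the abstract asks when the analogous statement holds for level-1 networks.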

CiteSeerX

University of East Anglia digital repository

HAL AMU

Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

Author: Amit U Sinha
BME Moret
D Sankoff
DA Bader
DL Wheeler
EM Marcotte
G Andelfinger
G Bourque
G Tesler
Jaroslaw Meller
JH Nadeau
JL Bentley
KA Frazer
LD Stein
M Clamp
PA Pevzner
Q Peng
S Hannenhalli
T Hubbard
TF Deluca
X Pan
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionarily conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. RESULTS: We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; ii) flexibility to adjust the parameters and re-compute the results on-the-fly; iii) ability to work with user-provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at . CONCLUSION: Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances. Cinteny can also be used to interactively browse syntenic blocks conserved in multiple genomes, to facilitate genome annotation and validation of assemblies for newly sequenced genomes, and to construct and assess phylogenomic trees.

Infoscience - École polytechnique fédérale de Lausanne

Refining transcriptional regulatory networks using network evolutionary models and gene histories

Author: A Bhan
A Crombach
A Stark
A Tanay
AL Barabási
Bernard ME Moret
BME Moret
C Roth
CT Harbison
D Durand
DM Hillis
G Bourque
J Kim
J Yu
KP Murphy
L Arvestad
M Kanehisa
MM Babu
N Friedman
R Wang
RDM Page
S Liang
SA Teichmann
SY Kim
T Akutsu
T Chen
T Pupko
X Zhang
Xiuwei Zhang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Computational inference of transcriptional regulatory networks remains a challenging problem, in part due to the lack of strong network models. In this paper we present evolutionary approaches to improve the inference of regulatory networks for a family of organisms by developing an evolutionary model for these networks and taking advantage of established phylogenetic relationships among these organisms. In previous work, we used a simple evolutionary model and provided extensive simulation results showing that phylogenetic information, combined with such a model, could be used to gain significant improvements on the performance of current inference algorithms. Results: In this paper, we extend the evolutionary model so as to take into account gene duplications and losses, which are viewed as major drivers in the evolution of regulatory networks. We show how to adapt our evolutionary approach to this new model and provide detailed simulation results, which show significant improvement on the reference network inference algorithms. Different evolutionary histories for gene duplications and losses are studied, showing that our adapted approach is feasible under a broad range of conditions. We also provide results on biological data (cis-regulatory modules for 12 species of Drosophila), confirming our simulation results.