27 research outputs found

Efficient FPT algorithms for (strict) compatibility of unrooted phylogenetic trees

Author: AD Gordon
AV Aho
C Scornavacca
D Bryant
D Lokshtanov
F Delsuc
J Felsenstein
M Frick
M Ng
M Steel
OR Bininda-Emonds
R Diestel
T Kloks
W Maddison
Publication venue
Publication date: 01/01/2016
Field of study

In phylogenetics, a central problem is to infer the evolutionary relationships between a set of species $X$; these relationships are often depicted via a phylogenetic tree -- a tree having its leaves univocally labeled by elements of $X$ and without degree-2 nodes -- called the "species tree". One common approach for reconstructing a species tree consists in first constructing several phylogenetic trees from primary data (e.g. DNA sequences originating from some species in $X$), and then constructing a single phylogenetic tree maximizing the "concordance" with the input trees. The so-obtained tree is our estimation of the species tree and, when the input trees are defined on overlapping -- but not identical -- sets of labels, is called "supertree". In this paper, we focus on two problems that are central when combining phylogenetic trees into a supertree: the compatibility and the strict compatibility problems for unrooted phylogenetic trees. These problems are strongly related, respectively, to the notions of "containing as a minor" and "containing as a topological minor" in the graph community. Both problems are known to be fixed-parameter tractable in the number of input trees $k$, by using their expressibility in Monadic Second Order Logic and a reduction to graphs of bounded treewidth. Motivated by the fact that the dependency on $k$ of these algorithms is prohibitively large, we give the first explicit dynamic programming algorithms for solving these problems, both running in time $2^{O(k^2)} \cdot n$, where $n$ is the total size of the input. Comment: 18 pages, 1 figure

arXiv.org e-Print Archive

Compatibility of unrooted phylogenetic trees is FPT

Author: David Bryant
Jens Lagergren
Publication venue
Publication date: 03/04/2020
Field of study

A collection $T_1, T_2, \ldots, T_k$ of unrooted, leaf-labelled (phylogenetic) trees, all with different leaf sets, is said to be compatible if there exists a tree $T$ such that each tree $T_i$ can be obtained from $T$ by deleting leaves and contracting edges. Determining compatibility is NP-hard, and the fastest algorithm to date has worst-case complexity of around $\Omega(n^k)$ time, $n$ being the number of leaves. Here, we present an $O(n f(k))$ algorithm, proving that compatibility of unrooted phylogenetic trees is fixed-parameter tractable (FPT) with respect to the number $k$ of trees

CiteSeerX
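The definition of compatibility in the abstract above can be illustrated with a small check. The following is a minimal sketch, not the FPT algorithm of the paper: it only verifies whether a given candidate tree displays each input tree, and it assumes (as an illustrative encoding, not taken from the source) that every unrooted tree is represented by its set of non-trivial splits. It relies on the standard fact that a tree can be obtained from a candidate by deleting leaves and contracting edges exactly when each of its splits appears among the candidate's splits restricted to its leaf set. The names `split`, `restrict`, and `displays` are hypothetical.

```python
def split(side, leaves):
    """Encode the split side | (leaves - side) as an unordered pair of frozensets."""
    side = frozenset(side)
    return frozenset({side, frozenset(leaves) - side})


def restrict(splits, subset):
    """Restrict each split to `subset`, keeping only non-trivial results."""
    subset = frozenset(subset)
    restricted = set()
    for a, b in splits:
        a, b = a & subset, b & subset
        # A split is non-trivial when both sides hold at least two leaves.
        if len(a) >= 2 and len(b) >= 2:
            restricted.add(frozenset({a, b}))
    return restricted


def displays(candidate_splits, tree_splits, tree_leaves):
    """True if the candidate tree displays the tree given by `tree_splits`."""
    return set(tree_splits) <= restrict(candidate_splits, tree_leaves)


# Candidate supertree on {a,b,c,d,e} with splits ab|cde and abc|de (assumed example data).
candidate = {split("ab", "abcde"), split("abc", "abcde")}
t1 = {split("ab", "abcd")}  # quartet ab|cd
t2 = {split("bc", "bcde")}  # quartet bc|de
t3 = {split("ac", "abcd")}  # quartet ac|bd, conflicts with the candidate

print(displays(candidate, t1, "abcd"), displays(candidate, t2, "bcde"))
print(displays(candidate, t3, "abcd"))
```

Checking one candidate is easy; the hardness discussed in the abstract lies in deciding whether any such candidate exists.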

Treewidth of display graphs: bounds, brambles and applications

Author: Janssen Remie
Jones Mark
Kelk Steven
Stamoulis Georgios
Wu Taoyang
Publication venue: 'Journal of Graph Algorithms and Applications'
Publication date: 04/09/2018
Field of study

Phylogenetic trees and networks are leaf-labelled graphs used to model evolution. Display graphs are created by identifying common leaf labels in two or more phylogenetic trees or networks. The treewidth of such graphs is bounded as a function of many common dissimilarity measures between phylogenetic trees and this has been leveraged in fixed parameter tractability results. Here we further elucidate the properties of display graphs and their interaction with treewidth. We show that it is NP-hard to recognize display graphs, but that display graphs of bounded treewidth can be recognized in linear time. Next we show that if a phylogenetic network displays (i.e. topologically embeds) a phylogenetic tree, the treewidth of their display graph is bounded by a function of the treewidth of the original network (and also by various other parameters). In fact, using a bramble argument we show that this treewidth bound is sharp up to an additive term of 1. We leverage this bound to give an FPT algorithm, parameterized by treewidth, for determining whether a network displays a tree, which is an intensively-studied problem in the field. We conclude with a discussion on the future use of display graphs and treewidth in phylogenetics

arXiv.org e-Print Archive

Maastricht University Research Portal

TU Delft Repository

CWI's Institutional Repository

University of East Anglia digital repository
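The construction of a display graph described in the abstract above (identifying common leaf labels across trees) can be sketched as follows. This is an illustrative sketch under assumptions: each tree is given as a list of edges over integer node ids plus a dictionary mapping leaf ids to labels, and the function name `display_graph` is hypothetical rather than taken from the paper.

```python
def display_graph(trees):
    """Build the display graph as an adjacency dict.

    `trees` is a list of (edges, leaf_labels) pairs, where `edges` is a list of
    (u, v) node-id pairs and `leaf_labels` maps leaf node ids to labels.
    Leaves with the same label are merged into one vertex; internal nodes
    stay distinct per tree.
    """
    adjacency = {}
    for index, (edges, leaf_labels) in enumerate(trees):
        def vertex(node):
            # Leaves are keyed by label only, so equal labels are identified.
            if node in leaf_labels:
                return ("leaf", leaf_labels[node])
            return ("inner", index, node)

        for u, v in edges:
            adjacency.setdefault(vertex(u), set()).add(vertex(v))
            adjacency.setdefault(vertex(v), set()).add(vertex(u))
    return adjacency


# Two quartet trees on leaves a, b, c, d (assumed example data).
labels = {2: "a", 3: "b", 4: "c", 5: "d"}
tree_ab_cd = ([(0, 2), (0, 3), (0, 1), (1, 4), (1, 5)], labels)  # ab|cd
tree_ac_bd = ([(0, 2), (0, 4), (0, 1), (1, 3), (1, 5)], labels)  # ac|bd

graph = display_graph([tree_ab_cd, tree_ac_bd])
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
print(len(graph), edge_count)
```

The resulting adjacency dictionary could then be passed to any treewidth routine; none is assumed here.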

Composing dynamic programming tree-decomposition-based algorithms

Author: Baste Julien
Publication venue
Publication date: 29/04/2019
Field of study

Given two integers $\ell$ and $p$ as well as $\ell$ graph classes $\mathcal{H}_1,\ldots,\mathcal{H}_\ell$, the problems $\mathsf{GraphPart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell,p)$, $\mathsf{VertPart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell)$, and $\mathsf{EdgePart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell)$ ask, given graph $G$ as input, whether $V(G)$, $V(G)$, $E(G)$ respectively can be partitioned into $\ell$ sets $S_1, \ldots, S_\ell$ such that, for each $i$ between $1$ and $\ell$, $G[S_i] \in \mathcal{H}_i$, $G[S_i] \in \mathcal{H}_i$, $(V(G),S_i) \in \mathcal{H}_i$ respectively. Moreover in $\mathsf{GraphPart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell,p)$, we request that the number of edges with endpoints in different sets of the partition is bounded by $p$. We show that if there exist dynamic programming tree-decomposition-based algorithms for recognizing the graph classes $\mathcal{H}_i$, for each $i$, then we can constructively create dynamic programming tree-decomposition-based algorithms for $\mathsf{GraphPart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell,p)$, $\mathsf{VertPart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell)$, and $\mathsf{EdgePart}(\mathcal{H}_1, \ldots, \mathcal{H}_\ell)$. We show that, in some known cases, the obtained running times are comparable to those of the best known algorithms

arXiv.org e-Print Archive

Embedding Phylogenetic Trees in Networks of Low Treewidth

Author: Jones Mark
van Iersel Leo
Weller Mathias
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 01/01/2022
Field of study

Given a rooted, binary phylogenetic network and a rooted, binary phylogenetic tree, can the tree be embedded into the network? This problem, called Tree Containment, arises when validating networks constructed by phylogenetic inference methods. We present the first algorithm for (rooted) Tree Containment using the treewidth $t$ of the input network $N$ as parameter, showing that the problem can be solved in $2^{O(t^2)} \cdot |N|$ time and space.

TU Delft Repository

Dagstuhl Research Online Publication Server

Treewidth distance on phylogenetic trees

Author: Allen
Arnborg
Arnborg
Baroni
Baste
Blair
Bodlaender
Bodlaender
Bodlaender
Bodlaender
Bodlaender
Bodlaender
Bodlaender
Bordewich
Bordewich
Bryant
Chuzhoy
Cygan
Dell
Diestel
Ding
Downey
Dujmovic
Fernández-Baca
Fischer
Gogate
Grigoriev
Grohe
Gysel
Humphries
Huson
Kelk
Kelk
Kelk
Lagergren
Linz
Moulton
Semple
Semple
Steel
Vakati
Vakati
van Iersel
van Iersel
Whidden
Publication venue: 'Elsevier BV'
Publication date: 31/03/2017
Field of study

In this article we study the treewidth of the display graph, an auxiliary graph structure obtained from the fusion of phylogenetic (i.e., evolutionary) trees at their leaves. Earlier work has shown that the treewidth of the display graph is bounded if the trees are in some formal sense topologically similar. Here we further expand upon this relationship. We analyse a number of reduction rules, commonly used in the phylogenetics literature to obtain fixed parameter tractable algorithms. In some cases (the subtree reduction) the reduction rules behave similarly with respect to treewidth, while others (the cluster reduction) behave very differently, and the behaviour of the chain reduction is particularly intriguing because of its link with graph separators and forbidden minors. We also show that the gap between treewidth and Tree Bisection and Reconnect (TBR) distance can be infinitely large, and that unlike, for example, planar graphs the treewidth of the display graph can be as much as linear in its number of vertices. A number of other auxiliary results are given. We conclude with a discussion and list a number of open problems

arXiv.org e-Print Archive

Maastricht University Research Portal

Crossref

University of East Anglia digital repository

Maximum agreement and compatible supertrees

Author: Berry Vincent
Nicolas François
Publication venue: Elsevier B.V.
Publication date: 01/01/2007
Field of study

Given a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT. A sufficient condition is given that identifies cases where these problems can be solved by resorting to MAST and MCT as subproblems. This condition is met, for instance, when only two input trees are considered. Then we give algorithms for SMAST and SMCT that benefit from the link with the subtree problems. These algorithms run in time linear to the time needed to solve MAST, respectively MCT, on an instance of the same or smaller size. It is shown that arbitrary instances of SMAST and SMCT can be turned in polynomial time into instances composed of trees with a bounded number of leaves. SMAST is shown to be W[2]-hard when the considered parameter is the number of input leaves that have to be removed to obtain the agreement of the input trees. A similar result holds for SMCT. Moreover, the corresponding optimization problems, that is the complements of SMAST and SMCT, cannot be approximated in polynomial time within any constant factor, unless P=NP. These results also hold when the input trees have a bounded number of leaves. The presented results apply to both collections of rooted and unrooted trees

Elsevier - Publisher Connector

HAL Descartes

Hal-Diderot

Contributions to computational phylogenetics and algorithmic self-assembly

Author: Shutters Brad
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2013
Field of study

This dissertation addresses some of the algorithmic and combinatorial problems at the interface between biology and computation. In particular, it focuses on problems in both computational phylogenetics, an area of study in which computation is used to better understand evolutionary relationships, and algorithmic self-assembly, an area of study in which biological processes are used to perform computation. The first set of results investigates inferring phylogenetic trees from multi-state character data. We give a novel characterization of when a set of three-state characters has a perfect phylogeny and make progress on a long-standing conjecture regarding the compatibility of multi-state characters. The next set of results investigates inferring phylogenetic supertrees from collections of smaller input trees when the input trees do not fully agree on the relative positions of the taxa. Two approaches to dealing with such conflicting input trees are considered. The first is to contract a set of edges in the input trees so that the resulting trees have an agreement supertree. The second is to remove a set of taxa from the input trees so that the resulting trees have an agreement supertree. We give fixed-parameter tractable algorithms for both approaches. We then turn to the algorithmic self-assembly of fractal structures from DNA tiles and investigate approximating the Sierpinski triangle and the Sierpinski carpet with strict self-assembly. We prove tight bounds on approximating the Sierpinski triangle and exhibit a class of fractals that are generalizations of the Sierpinski carpet that can approximately self-assemble. We conclude by discussing some ideas for further research

Digital Repository @ Iowa State University (ISU)

Advancing Divide-And-Conquer Phylogeny Estimation Using Robinson-Foulds Supertrees

Author: Christensen Sarah
Le Thien
Molloy Erin K.
Warnow Tandy
Yu Xilin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 20th International Workshop on Algorithms in Bioinformatics (WABI 2020)
Publication date: 01/01/2020
Field of study

One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a "supertree method". Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. We also present GreedyRFS (a greedy heuristic that operates by repeatedly using Exact-RFS-2 on pairs of trees, until all the trees are merged into a single supertree). We evaluate Exact-RFS-2 and GreedyRFS, and show that they have better accuracy than the current leading heuristic for RFS

Dagstuhl Research Online Publication Server
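The Robinson-Foulds Supertree criterion mentioned in the abstract above builds on the Robinson-Foulds (RF) distance between trees. The following is a minimal sketch of that pairwise distance only, for two unrooted trees on the same leaf set; it is not Exact-RFS-2 or GreedyRFS. It assumes, as an illustrative encoding, that each tree is given by its set of non-trivial splits, and it returns the unnormalized size of the symmetric difference (some texts halve this value). The names `split` and `rf_distance` are hypothetical.

```python
def split(side, leaves):
    """Encode the split side | (leaves - side) as an unordered pair of frozensets."""
    side = frozenset(side)
    return frozenset({side, frozenset(leaves) - side})


def rf_distance(splits_a, splits_b):
    """Number of splits present in exactly one of the two trees."""
    return len(set(splits_a) ^ set(splits_b))


# Two trees on leaves a..e that share the split ab|cde (assumed example data).
leaves = "abcde"
tree_a = {split("ab", leaves), split("abc", leaves)}
tree_b = {split("ab", leaves), split("abd", leaves)}

print(rf_distance(tree_a, tree_b))
```

Encoding a split as an unordered pair means that `split("ab", leaves)` and `split("cde", leaves)` compare equal, so the distance does not depend on which side of a split is written down.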