A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees
Here we present a new fixed parameter tractable algorithm to compute the
hybridization number r of two rooted, not necessarily binary phylogenetic trees
on taxon set X in time $(6^r \cdot r!) \cdot \mathrm{poly}(n)$, where n=|X|. The novelty of this
approach is its use of terminals, which are maximal elements of a natural
partial order on X, and several insights from the softwired clusters
literature. This yields a surprisingly simple and practical bounded-search
algorithm and offers an alternative perspective on the underlying combinatorial
structure of the hybridization number problem.
On Computing the Maximum Parsimony Score of a Phylogenetic Network
Phylogenetic networks are used to display the relationship of different
species whose evolution is not treelike, which is the case, for instance, in
the presence of hybridization events or horizontal gene transfers. Tree
inference methods such as Maximum Parsimony need to be modified in order to be
applicable to networks. In this paper, we discuss two different definitions of
Maximum Parsimony on networks, "hardwired" and "softwired", and examine the
complexity of computing them given a network topology and a character. By
exploiting a link with the problem Multicut, we show that computing the
hardwired parsimony score for 2-state characters is polynomial-time solvable,
while for characters with more states this problem becomes NP-hard but is still
approximable and fixed-parameter tractable in the parsimony score. On the other
hand we show that, for the softwired definition, obtaining even weak
approximation guarantees is already difficult for binary characters and
restricted network topologies, and fixed-parameter tractable algorithms in the
parsimony score are unlikely. On the positive side we show that computing the
softwired parsimony score is fixed-parameter tractable in the level of the
network, a natural parameter describing how tangled reticulate activity is in
the network. Finally, we show that both the hardwired and softwired parsimony
score can be computed efficiently using Integer Linear Programming. The
software has been made freely available.
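As a small illustration of the hardwired score discussed above, the following sketch brute-forces it on a toy network. It assumes the commonly used definition (not spelled out in the abstract): the hardwired parsimony score is the minimum, over all assignments of states to the internal nodes, of the number of network edges whose endpoints receive different states. The function name, the toy network and its node labels are hypothetical, and the exhaustive search is exponential in the number of internal nodes; it is not the Multicut-based or ILP approach the paper describes.

```python
from itertools import product


def hardwired_score(edges, leaf_states, states):
    """Brute-force hardwired parsimony score (assumed definition).

    edges       -- list of (parent, child) pairs of the network
    leaf_states -- dict mapping each leaf to its observed state
    states      -- iterable of all allowed character states
    """
    nodes = {n for edge in edges for n in edge}
    internal = sorted(nodes - set(leaf_states))
    best = None
    # Try every assignment of states to the internal nodes.
    for combo in product(states, repeat=len(internal)):
        assign = dict(leaf_states)
        assign.update(zip(internal, combo))
        # Count edges whose two endpoints carry different states.
        cost = sum(assign[u] != assign[v] for u, v in edges)
        if best is None or cost < best:
            best = cost
    return best


# Hypothetical toy network: node "h" is a reticulation with parents "u" and "v".
toy_edges = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]
toy_character = {"a": 0, "b": 1, "c": 0}

score = hardwired_score(toy_edges, toy_character, states=(0, 1))
print(score)  # -> 1 (a single change on the edge h -> b)
```

Because both reticulation edges into "h" are counted, giving "h" state 1 would cost two changes, so the optimum keeps every internal node at state 0.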
Phylogenetic Networks Do not Need to Be Complex: Using Fewer Reticulations to Represent Conflicting Clusters
Phylogenetic trees are widely used to display estimates of how groups of
species evolved. Each phylogenetic tree can be seen as a collection of
clusters, subgroups of the species that evolved from a common ancestor. When
phylogenetic trees are obtained for several data sets (e.g. for different
genes), their clusters often contradict each other. Consequently, the set of
all clusters of such a data set cannot be combined into a single phylogenetic
tree. Phylogenetic networks are a generalization of phylogenetic trees that can
be used to display more complex evolutionary histories, including reticulate
events such as hybridizations, recombinations and horizontal gene transfers.
Here we present the new CASS algorithm that can combine any set of clusters
into a phylogenetic network. We show that the networks constructed by CASS are
usually simpler than networks constructed by other available methods. Moreover,
we show that CASS is guaranteed to produce a network with at most two
reticulations per biconnected component, whenever such a network exists. We
have implemented CASS and integrated it in the freely available Dendroscope
software.
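To make the notion of contradicting clusters concrete, here is a minimal sketch that extracts the clusters of two small rooted trees and lists the pairs that cannot appear together in one tree. It uses the standard compatibility test (two clusters fit in one rooted tree exactly when they are disjoint or one contains the other). The nested-tuple tree encoding and all function names are assumptions made for this example; this is not the CASS algorithm.

```python
from itertools import combinations


def clusters(tree):
    """Return all clusters of a rooted tree given as nested tuples.

    Leaves are strings; every node contributes the set of leaves below it.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(leaves_below(c) for c in node))
        found.add(leaf_set)
        return leaf_set

    leaves_below(tree)
    return found


def compatible(c1, c2):
    """Two clusters are compatible if they are disjoint or nested."""
    return c1.isdisjoint(c2) or c1 <= c2 or c2 <= c1


def conflicting_pairs(cluster_set):
    """List every pair of clusters that cannot share a single tree."""
    return [(c1, c2) for c1, c2 in combinations(cluster_set, 2)
            if not compatible(c1, c2)]


# Two hypothetical gene trees on the taxa a, b, c, d.
tree1 = ((("a", "b"), "c"), "d")
tree2 = (("a", ("b", "c")), "d")

conflicts = conflicting_pairs(clusters(tree1) | clusters(tree2))
for c1, c2 in conflicts:
    print(sorted(c1), "conflicts with", sorted(c2))
```

Here the only conflict is between {a, b} from the first tree and {b, c} from the second, so a network displaying all clusters needs at least one reticulation.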
A quadratic kernel for computing the hybridization number of multiple trees
It has recently been shown that the NP-hard problem of calculating the
minimum number of hybridization events that is needed to explain a set of
rooted binary phylogenetic trees by means of a hybridization network is
fixed-parameter tractable if an instance of the problem consists of precisely
two such trees. In this paper, we show that this problem remains
fixed-parameter tractable for an arbitrarily large set of rooted binary
phylogenetic trees. In particular, we present a quadratic kernel.
A Survey of Combinatorial Methods for Phylogenetic Networks
The evolutionary history of a set of species is usually described by a rooted phylogenetic tree. Although it is generally undisputed that bifurcating speciation events and descent with modifications are major forces of evolution, there is a growing belief that reticulate events also have a role to play. Phylogenetic networks provide an alternative to phylogenetic trees and may be more suitable for data sets where evolution involves significant amounts of reticulate events, such as hybridization, horizontal gene transfer, or recombination. In this article, we give an introduction to the topic of phylogenetic networks, very briefly describing the fundamental concepts and summarizing some of the most important combinatorial methods that are available for their computation.
A tight kernel for computing the tree bisection and reconnection distance between two phylogenetic trees
In 2001 Allen and Steel showed that, if subtree and chain reduction rules
have been applied to two unrooted phylogenetic trees, the reduced trees will
have at most 28k taxa where k is the TBR (Tree Bisection and Reconnection)
distance between the two trees. Here we reanalyse Allen and Steel's
kernelization algorithm and prove that the reduced instances will in fact have
at most 15k-9 taxa. Moreover we show, by describing a family of instances which
have exactly 15k-9 taxa after reduction, that this new bound is tight. These
instances also have no common clusters, showing that a third
commonly-encountered reduction rule, the cluster reduction, cannot further
reduce the size of the kernel in the worst case. To achieve these results we
introduce and use "unrooted generators" which are analogues of rooted
structures that have appeared earlier in the phylogenetic networks literature.
Using similar argumentation we show that, for the minimum hybridization problem
on two rooted trees, 9k-2 is a tight bound (when subtree and chain reduction
rules have been applied) and 9k-4 is a tight bound (when, additionally, the
cluster reduction has been applied) on the number of taxa, where k is the
hybridization number of the two trees.
Comment: One figure added, two small typos fixed. This version to appear in
SIDMA (SIAM Journal on Discrete Mathematics).
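The kernel-size bounds quoted in this abstract can be tabulated directly. The sketch below simply evaluates the stated formulas for small k; the function names and dictionary keys are hypothetical, and restricting to k >= 1 is an assumption made here so that the bounds stay positive.

```python
def tbr_kernel_bounds(k):
    """Upper bounds on taxa after subtree and chain reduction (TBR distance k)."""
    if k < 1:
        raise ValueError("k is assumed to be at least 1 in this sketch")
    return {"allen_steel_2001": 28 * k, "tight": 15 * k - 9}


def hybridization_kernel_bounds(k):
    """Tight bounds on taxa for minimum hybridization on two rooted trees."""
    if k < 1:
        raise ValueError("k is assumed to be at least 1 in this sketch")
    return {"subtree_chain": 9 * k - 2, "with_cluster": 9 * k - 4}


for k in range(1, 4):
    print(k, tbr_kernel_bounds(k), hybridization_kernel_bounds(k))
```

For example, at k = 2 the earlier TBR bound allows 56 taxa while the tight bound allows 21.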