613 research outputs found
New Results on Optimizing Rooted Triplets Consistency
A set of phylogenetic trees with overlapping leaf sets is consistent if it can be merged without conflicts into a supertree. In this paper, we study the polynomial-time approximability of two related optimization problems called the maximum rooted triplets consistency problem (MaxRTC) and the minimum rooted triplets inconsistency problem (MinRTI) in which the input is a set R of rooted triplets, and where the objectives are to find a largest cardinality subset of R which is consistent and a smallest cardinality subset of R whose removal from R results in a consistent set, respectively. We first show that a simple modification to Wu’s Best-Pair-Merge-First heuristic [25] results in a bottom-up-based 3-approximation for MaxRTC. We then demonstrate how any approximation algorithm for MinRTI could be used to approximate MaxRTC, and thus obtain the first polynomial-time approximation algorithm for MaxRTC with approximation ratio smaller than 3. Next, we prove that f
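As an illustration of what "consistent" means for a set of rooted triplets, the sketch below implements the classic BUILD procedure of Aho, Sagiv, Szymanski and Ullman. This is a standard consistency test, not the approximation algorithm described in the abstract. The nested-tuple tree representation and the function name `build` are assumptions made for this example; a triplet `(a, b, c)` stands for ab|c.

```python
def build(leaves, triplets):
    """Return a nested-tuple tree consistent with all triplets, or None.

    A triplet (a, b, c) encodes ab|c: a and b are closer to each other
    than either is to c. (Representation chosen for this sketch only.)
    """
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))
    if len(leaves) == 2:
        return tuple(sorted(leaves))

    # Union-find over the current leaf set.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Only triplets fully inside the current leaf set matter here.
    relevant = [t for t in triplets if set(t) <= leaves]
    for a, b, _ in relevant:
        parent[find(a)] = find(b)  # a and b must share a child subtree

    components = {}
    for x in leaves:
        components.setdefault(find(x), set()).add(x)
    if len(components) == 1:
        return None  # everything is forced together: inconsistent

    subtrees = []
    for comp in sorted(components.values(), key=sorted):
        sub = build(comp, relevant)
        if sub is None:
            return None
        subtrees.append(sub)
    return tuple(subtrees)


print(build("abc", [("a", "b", "c")]))
print(build("abc", [("a", "b", "c"), ("a", "c", "b")]))
```

The first call yields a tree grouping a with b; the second returns `None` because ab|c and ac|b cannot both hold in one tree. MaxRTC and MinRTI ask how much of such an inconsistent input can be kept, or must be removed, respectively.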
Optimizing Phylogenetic Supertrees Using Answer Set Programming
The supertree construction problem is about combining several phylogenetic
trees with possibly conflicting information into a single tree that has all the
leaves of the source trees as its leaves and the relationships between the
leaves are as consistent with the source trees as possible. This leads to an
optimization problem that is computationally challenging and typically
heuristic methods, such as matrix representation with parsimony (MRP), are
used. In this paper we consider the use of answer set programming to solve the
supertree construction problem in terms of two alternative encodings. The first
is based on an existing encoding of trees using substructures known as
quartets, while the other novel encoding captures the relationships present in
trees through direct projections. We use these encodings to compute a
genus-level supertree for the family of cats (Felidae). Furthermore, we compare
our results to recent supertrees obtained by the MRP method. Comment: To appear in Theory and Practice of Logic Programming (TPLP),
Proceedings of ICLP 201
A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks
Recently much attention has been devoted to the construction of phylogenetic
networks which generalize phylogenetic trees in order to accommodate complex
evolutionary processes. Here we present an efficient, practical algorithm for
reconstructing level-1 phylogenetic networks - a type of network slightly more
general than a phylogenetic tree - from triplets. Our algorithm has been made
publicly available as the program LEV1ATHAN. It combines ideas from several
known theoretical algorithms for phylogenetic tree and network reconstruction
with two novel subroutines, namely an exponential-time exact algorithm and a
greedy algorithm, both of which are of independent theoretical interest. Most
importantly, LEV1ATHAN runs in polynomial time and always constructs a level-1
network. If the data is consistent with a phylogenetic tree, then the algorithm
constructs such a tree. Moreover, if the input triplet set is dense and, in
addition, is fully consistent with some level-1 network, it will find such a
network. The potential of LEV1ATHAN is explored by means of an extensive
simulation study and a biological data set. One of our conclusions is that
LEV1ATHAN is able to construct networks consistent with a high percentage of
input triplets, even when these input triplets are affected by a low to
moderate level of noise.
Trinets encode tree-child and level-2 phylogenetic networks
Phylogenetic networks generalize evolutionary trees, and are commonly used to
represent evolutionary histories of species that undergo reticulate
evolutionary processes such as hybridization, recombination and lateral gene
transfer. Recently, there has been great interest in trying to develop methods
to construct rooted phylogenetic networks from triplets, that is, rooted trees
on three species. However, although triplets determine or encode rooted
phylogenetic trees, they do not in general encode rooted phylogenetic networks,
which is a potential issue for any such method. Motivated by this fact, Huber
and Moulton recently introduced trinets as a natural extension of rooted
triplets to networks. In particular, they showed that level-1 phylogenetic
networks are encoded by their trinets, and also conjectured that all
"recoverable" rooted phylogenetic networks are encoded by their trinets. Here
we prove that recoverable binary level-2 networks and binary tree-child
networks are also encoded by their trinets. To do this we prove two
decomposition theorems based on trinets which hold for all recoverable binary
rooted phylogenetic networks. Our results provide some additional evidence in
support of the conjecture that trinets encode all recoverable rooted
phylogenetic networks, and could also lead to new approaches to construct
phylogenetic networks from trinets.
Phylogenetic CSPs are Approximation Resistant
We study the approximability of a broad class of computational problems --
originally motivated in evolutionary biology and phylogenetic reconstruction --
concerning the aggregation of potentially inconsistent (local) information
about items of interest, and we present optimal hardness of approximation
results under the Unique Games Conjecture. The class of problems studied here
can be described as Constraint Satisfaction Problems (CSPs) over infinite
domains, where instead of values {0,1} or a fixed-size domain, the
variables can be mapped to any of the leaves of a phylogenetic tree. The
topology of the tree then determines whether a given constraint on the
variables is satisfied or not, and the resulting CSPs are called Phylogenetic
CSPs. Prominent examples of Phylogenetic CSPs with a long history and
applications in various disciplines include: Triplet Reconstruction, Quartet
Reconstruction, Subtree Aggregation (Forbidden or Desired). For example, in
Triplet Reconstruction, we are given triplets of the form ij|k
(indicating that "items i and j are more similar to each other than to k")
and we want to construct a hierarchical clustering on the items, that
respects the constraints as much as possible. Despite more than four decades of
research, the basic question of maximizing the number of satisfied constraints
is not well-understood. The current best approximation is achieved by
outputting a random tree (for triplets, this achieves a 1/3 approximation). Our
main result is that every Phylogenetic CSP is approximation resistant, i.e.,
there is no polynomial-time algorithm that does asymptotically better than a
(biased) random assignment. This is a generalization of the results in
Guruswami, Hastad, Manokaran, Raghavendra, and Charikar (2011), who showed that
ordering CSPs are approximation resistant (e.g., Max Acyclic Subgraph,
Betweenness). Comment: 45 pages, 11 figures, Abstract shortened for arxi
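The 1/3 figure for triplets can be made concrete with a small sketch. A triplet constraint is written here as `(i, j, k)`, meaning i and j are closer to each other than to k. The code checks whether a tree satisfies a constraint and shows that, of the three rooted binary topologies on three items, exactly one satisfies any given constraint. The abstract does not specify the random-tree distribution; the sketch assumes one under which the three topologies are equally likely for each triple. The nested-tuple representation and the names `leaf_set`, `satisfies` and `score` are hypothetical.

```python
def leaf_set(tree):
    """Collect the leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return {tree}
    out = set()
    for child in tree:
        out |= leaf_set(child)
    return out


def satisfies(tree, triplet):
    """True if lca(i, j) lies strictly below lca(i, k) in the tree."""
    i, j, k = triplet
    if not {i, j, k} <= leaf_set(tree):
        return False
    node = tree
    while isinstance(node, tuple):
        # Follow the child that contains i.
        child = next(c for c in node if i in leaf_set(c))
        below = leaf_set(child)
        if j in below and k not in below:
            return True   # i and j split off together, away from k
        if not (j in below and k in below):
            return False  # j is separated from i at or above k's split
        node = child      # all three still together: keep descending
    return False


def score(tree, triplets):
    """Number of triplet constraints the tree satisfies."""
    return sum(satisfies(tree, t) for t in triplets)


topologies = [(("a", "b"), "c"), (("a", "c"), "b"), (("b", "c"), "a")]
print([satisfies(t, ("a", "b", "c")) for t in topologies])
```

Under the stated assumption, each constraint is satisfied with probability 1/3, so by linearity of expectation a random tree satisfies one third of the constraints on average.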
Building a Small and Informative Phylogenetic Supertree
We combine two fundamental, previously studied optimization problems related to the construction of phylogenetic trees called maximum rooted triplets consistency (MAXRTC) and minimally resolved supertree (MINRS) into a new problem, which we call q-maximum rooted triplets consistency (q-MAXRTC). The input to our new problem is a set R of resolved triplets (rooted, binary phylogenetic trees with three leaves each) and the objective is to find a phylogenetic tree with exactly q internal nodes that contains the largest possible number of triplets from R. We first prove that q-MAXRTC is NP-hard even to approximate within a constant ratio for every fixed q >= 2, and then develop various polynomial-time approximation algorithms for different values of q. Next, we show experimentally that representing a phylogenetic tree by one having far fewer nodes typically does not destroy too much triplet branching information. As an extreme example, we show that allowing only nine internal nodes is still sufficient to capture on average 80% of the rooted triplets from some recently published trees, each having between 760 and 3081 internal nodes. Finally, to demonstrate the algorithmic advantage of using trees with few internal nodes, we propose a new algorithm for computing the rooted triplet distance between two phylogenetic trees over a leaf label set of size n that runs in O(q n) time, where q is the number of internal nodes in the smaller tree, and is therefore faster than the currently best algorithms for the problem (with O(n log n) time complexity [SODA 2013, ESA 2017]) whenever q = o(log n).
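For reference, the rooted triplet distance counts the three-leaf subsets on which two trees induce different topologies. The sketch below is a brute-force baseline that inspects every subset. It is not the O(q n) algorithm the abstract proposes, and it is far slower. The nested-tuple representation and the names `leaf_set`, `topology` and `triplet_distance` are assumptions for illustration.

```python
from itertools import combinations


def leaf_set(tree):
    """Collect the leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return {tree}
    out = set()
    for child in tree:
        out |= leaf_set(child)
    return out


def topology(tree, x, y, z):
    """Return the pair of leaves grouped together, or None for a fan.

    Assumes x, y and z are distinct leaves of the tree.
    """
    target = {x, y, z}
    node = tree
    while isinstance(node, tuple):
        hits = [(child, leaf_set(child) & target) for child in node]
        deeper = [child for child, hit in hits if len(hit) == 3]
        if deeper:
            node = deeper[0]  # all three leaves sit in one child
            continue
        pairs = [hit for _, hit in hits if len(hit) == 2]
        return frozenset(pairs[0]) if pairs else None
    return None


def triplet_distance(t1, t2):
    """Count 3-subsets of leaves whose induced topology differs."""
    leaves = sorted(leaf_set(t1))
    if set(leaves) != leaf_set(t2):
        raise ValueError("trees must share the same leaf set")
    return sum(
        topology(t1, *triple) != topology(t2, *triple)
        for triple in combinations(leaves, 3)
    )


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(triplet_distance(t1, t2))
```

The two example trees disagree only on the subset {a, b, c}, so the distance is 1. The cubic number of subsets is what makes faster algorithms, such as those cited in the abstract, worthwhile.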