613 research outputs found

    New Results on Optimizing Rooted Triplets Consistency

    A set of phylogenetic trees with overlapping leaf sets is consistent if it can be merged without conflicts into a supertree. In this paper, we study the polynomial-time approximability of two related optimization problems called the maximum rooted triplets consistency problem (MaxRTC) and the minimum rooted triplets inconsistency problem (MinRTI) in which the input is a set R of rooted triplets, and where the objectives are to find a largest cardinality subset of R which is consistent and a smallest cardinality subset of R whose removal from R results in a consistent set, respectively. We first show that a simple modification to Wu’s Best-Pair-Merge-First heuristic [25] results in a bottom-up-based 3-approximation for MaxRTC. We then demonstrate how any approximation algorithm for MinRTI could be used to approximate MaxRTC, and thus obtain the first polynomial-time approximation algorithm for MaxRTC with approximation ratio smaller than 3. Next, we prove that f
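
To make the notion of a consistent triplet set concrete, here is a minimal Python sketch of the classical BUILD-style consistency test (Aho, Sagiv, Szymanski and Ullman). It is not the approximation algorithm from the abstract above. The function name `is_consistent` and the encoding of a triplet ab|c as the tuple `(a, b, c)` are assumptions made for illustration.

```python
from collections import defaultdict


def is_consistent(triplets, leaves=None):
    """Return True if some rooted tree displays every triplet.

    Each triplet (a, b, c) encodes ab|c: a and b are closer to each
    other than either is to c. (Hypothetical encoding.)
    """
    triplets = list(triplets)
    if leaves is None:
        leaves = {x for t in triplets for x in t}
    leaves = set(leaves)
    if len(leaves) <= 2:
        return True  # no triplet fits on fewer than three leaves

    # Union-find over the leaves: join a and b for every triplet ab|c.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _c in triplets:
        parent[find(a)] = find(b)

    components = defaultdict(set)
    for x in leaves:
        components[find(x)].add(x)

    # One component means the leaves cannot be split at the root.
    if len(components) == 1:
        return False

    # Recurse on each component with the triplets it fully contains.
    for comp in components.values():
        sub = [t for t in triplets if set(t) <= comp]
        if not is_consistent(sub, comp):
            return False
    return True
```

For example, `is_consistent([("a", "b", "c"), ("c", "d", "a")])` holds because the tree ((a,b),(c,d)) displays both triplets, while `[("a", "b", "c"), ("b", "c", "a")]` cannot be displayed by any single tree.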

    Optimizing Phylogenetic Supertrees Using Answer Set Programming

    The supertree construction problem is about combining several phylogenetic trees with possibly conflicting information into a single tree that has all the leaves of the source trees as its leaves and in which the relationships between the leaves are as consistent with the source trees as possible. This leads to an optimization problem that is computationally challenging, and heuristic methods, such as matrix representation with parsimony (MRP), are typically used. In this paper we consider the use of answer set programming to solve the supertree construction problem in terms of two alternative encodings. The first is based on an existing encoding of trees using substructures known as quartets, while the other, novel encoding captures the relationships present in trees through direct projections. We use these encodings to compute a genus-level supertree for the family of cats (Felidae). Furthermore, we compare our results to recent supertrees obtained by the MRP method. Comment: To appear in Theory and Practice of Logic Programming (TPLP), Proceedings of ICLP 201

    A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks

    Recently much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks - a type of network slightly more general than a phylogenetic tree - from triplets. Our algorithm has been made publicly available as the program LEV1ATHAN. It combines ideas from several known theoretical algorithms for phylogenetic tree and network reconstruction with two novel subroutines, namely an exponential-time exact algorithm and a greedy algorithm, both of which are of independent theoretical interest. Most importantly, LEV1ATHAN runs in polynomial time and always constructs a level-1 network. If the data is consistent with a phylogenetic tree, then the algorithm constructs such a tree. Moreover, if the input triplet set is dense and, in addition, is fully consistent with some level-1 network, it will find such a network. The potential of LEV1ATHAN is explored by means of an extensive simulation study and a biological data set. One of our conclusions is that LEV1ATHAN is able to construct networks consistent with a high percentage of input triplets, even when these input triplets are affected by a low to moderate level of noise

    Trinets encode tree-child and level-2 phylogenetic networks

    Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is, rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that level-1 phylogenetic networks are encoded by their trinets, and also conjectured that all "recoverable" rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level-2 networks and binary tree-child networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets

    Phylogenetic CSPs are Approximation Resistant

    We study the approximability of a broad class of computational problems -- originally motivated in evolutionary biology and phylogenetic reconstruction -- concerning the aggregation of potentially inconsistent (local) information about n items of interest, and we present optimal hardness of approximation results under the Unique Games Conjecture. The class of problems studied here can be described as Constraint Satisfaction Problems (CSPs) over infinite domains, where instead of values {0,1} or a fixed-size domain, the variables can be mapped to any of the n leaves of a phylogenetic tree. The topology of the tree then determines whether a given constraint on the variables is satisfied or not, and the resulting CSPs are called Phylogenetic CSPs. Prominent examples of Phylogenetic CSPs with a long history and applications in various disciplines include: Triplet Reconstruction, Quartet Reconstruction, Subtree Aggregation (Forbidden or Desired). For example, in Triplet Reconstruction, we are given m triplets of the form ij|k (indicating that "items i, j are more similar to each other than to k") and we want to construct a hierarchical clustering on the n items that respects the constraints as much as possible. Despite more than four decades of research, the basic question of maximizing the number of satisfied constraints is not well-understood. The current best approximation is achieved by outputting a random tree (for triplets, this achieves a 1/3 approximation). Our main result is that every Phylogenetic CSP is approximation resistant, i.e., there is no polynomial-time algorithm that does asymptotically better than a (biased) random assignment. This is a generalization of the results in Guruswami, Hastad, Manokaran, Raghavendra, and Charikar (2011), who showed that ordering CSPs are approximation resistant (e.g., Max Acyclic Subgraph, Betweenness). Comment: 45 pages, 11 figures, Abstract shortened for arxi
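
The objective in Triplet Reconstruction, the fraction of constraints ij|k that a given tree satisfies, can be sketched as follows. This is only an illustrative scorer, not anything from the paper. It assumes a hypothetical encoding in which a tree is a nested tuple with string leaves and a constraint ij|k is the tuple `(i, j, k)`.

```python
def leaf_paths(tree, prefix=()):
    """Map each leaf to its path of child indices from the root."""
    if not isinstance(tree, tuple):
        return {tree: prefix}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, prefix + (i,)))
    return paths


def lca_depth(p, q):
    """Depth of the lowest common ancestor = length of the shared prefix."""
    depth = 0
    for x, y in zip(p, q):
        if x != y:
            break
        depth += 1
    return depth


def satisfies(paths, triplet):
    """ij|k holds when lca(i, j) lies strictly below lca(i, k)."""
    i, j, k = triplet
    return lca_depth(paths[i], paths[j]) > lca_depth(paths[i], paths[k])


def fraction_satisfied(tree, triplets):
    """Fraction of the given constraints that the tree respects."""
    paths = leaf_paths(tree)
    return sum(satisfies(paths, t) for t in triplets) / len(triplets)
```

On the tree `(("a", "b"), ("c", "d"))`, the constraints ab|c and cd|a are satisfied and ac|b is not, so `fraction_satisfied` returns 2/3 for those three constraints.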

    Building a Small and Informative Phylogenetic Supertree

    We combine two fundamental, previously studied optimization problems related to the construction of phylogenetic trees called maximum rooted triplets consistency (MAXRTC) and minimally resolved supertree (MINRS) into a new problem, which we call q-maximum rooted triplets consistency (q-MAXRTC). The input to our new problem is a set R of resolved triplets (rooted, binary phylogenetic trees with three leaves each) and the objective is to find a phylogenetic tree with exactly q internal nodes that contains the largest possible number of triplets from R. We first prove that q-MAXRTC is NP-hard even to approximate within a constant ratio for every fixed q >= 2, and then develop various polynomial-time approximation algorithms for different values of q. Next, we show experimentally that representing a phylogenetic tree by one having much fewer nodes typically does not destroy too much triplet branching information. As an extreme example, we show that allowing only nine internal nodes is still sufficient to capture on average 80% of the rooted triplets from some recently published trees, each having between 760 and 3081 internal nodes. Finally, to demonstrate the algorithmic advantage of using trees with few internal nodes, we propose a new algorithm for computing the rooted triplet distance between two phylogenetic trees over a leaf label set of size n that runs in O(q n) time, where q is the number of internal nodes in the smaller tree, and is therefore faster than the currently best algorithms for the problem (with O(n log n) time complexity [SODA 2013, ESA 2017]) whenever q = o(log n)
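
For orientation, the rooted triplet distance mentioned above can be computed by brute force over all three-leaf subsets. The sketch below is this naive cubic-time baseline, not the O(q n) algorithm proposed in the abstract. It assumes nested-tuple trees with string leaves, identical leaf sets in both trees, and at least one internal node per tree; all function names are hypothetical.

```python
from itertools import combinations


def clusters(tree):
    """Collect the leaf set below every internal node of a nested-tuple tree."""
    found = []

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.append(below)
        return below

    walk(tree)
    return found


def topology(cl, a, b, c):
    """Return the closest pair among a, b, c, or None for an unresolved fan."""
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        # xy|z holds when some cluster contains x and y but not z.
        if any(x in s and y in s and z not in s for s in cl):
            return frozenset((x, y))
    return None


def triplet_distance(t1, t2):
    """Count three-leaf subsets whose induced topology differs between trees."""
    c1, c2 = clusters(t1), clusters(t2)
    leaves = sorted(max(c1, key=len))  # the root cluster holds every leaf
    return sum(
        topology(c1, *abc) != topology(c2, *abc)
        for abc in combinations(leaves, 3)
    )
```

For instance, the trees `((("a", "b"), "c"), "d")` and `((("a", "c"), "b"), "d")` disagree only on the subset {a, b, c}, so their distance under this sketch is 1.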