Computing Robinson-Foulds supertree for two trees

Authors: Xilin Yu
Publication date: 1 August 2019

Abstract

Supertree problems are important in phylogeny estimation. Supertree construction takes a set of input trees on subsets of species and aims to find a supertree containing all species, subject to some combinatorial or statistical criterion. As such, it can be used to combine trees estimated by different research projects, to construct species trees from gene trees that may not contain all species, or to serve as a component of divide-and-conquer pipelines that improve the scalability of large-scale phylogeny estimation. Yet the most promising supertree methods, such as the popular Robinson-Foulds Supertree (RFS) methods, cannot guarantee an optimal solution and are themselves computationally intensive, as they are heuristics for NP-hard optimization problems. We present the first polynomial-time algorithm to exactly solve the RFS problem on two binary input trees, and prove that finding the Robinson-Foulds Supertree of three input trees is NP-hard. We present GreedyRFS, a greedy heuristic for the Robinson-Foulds Supertree problem that repeatedly applies our exact algorithm for RFS to pairs of trees until all the trees are merged into a single supertree. Our experiments show that GreedyRFS has better accuracy than FastRFS, the leading heuristic for RFS, when the number of input trees is small, which is the natural case for use within divide-and-conquer pipelines.
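The abstract does not spell out the optimization criterion. As it is commonly defined, the RFS score of a candidate supertree is the sum, over the input trees, of the Robinson-Foulds distance between each input tree and the supertree restricted to that tree's species. The Python sketch below only evaluates this score for a given candidate; it is not the thesis's exact two-tree algorithm and not GreedyRFS. The nested-tuple tree encoding and all function names (`leaves`, `restrict`, `bipartitions`, `rf_distance`, `rfs_score`) are assumptions made for illustration.

```python
# Trees are nested tuples of leaf names (an illustrative encoding, not the
# thesis's data structure), e.g. (("a", "b"), ("c", "d")).

def leaves(tree):
    """Return the set of leaf names in a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def restrict(tree, keep):
    """Prune the tree to the leaves in `keep`, suppressing unary nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)

def bipartitions(tree):
    """Return the nontrivial leaf bipartitions induced by the tree's edges."""
    all_leaves = leaves(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if len(below) >= 2 and len(other) >= 2:
            splits.add(frozenset([below, other]))
        return below

    visit(tree)
    return splits

def rf_distance(t1, t2):
    """Robinson-Foulds distance: size of the symmetric difference of splits."""
    assert leaves(t1) == leaves(t2), "trees must share the same leaf set"
    return len(bipartitions(t1) ^ bipartitions(t2))

def rfs_score(supertree, inputs):
    """Sum of RF distances between each input and the restricted supertree."""
    return sum(rf_distance(restrict(supertree, leaves(t)), t) for t in inputs)

# Two input trees on overlapping species sets, and two candidate supertrees.
t1 = (("a", "b"), ("c", "d"))
t2 = (("c", "d"), ("e", "f"))
compatible = (("a", "b"), (("c", "d"), ("e", "f")))
conflicting = (("a", "c"), (("b", "d"), ("e", "f")))
```

Under these assumptions, `rfs_score(compatible, [t1, t2])` evaluates to 0 because the candidate displays both inputs, while `rfs_score(conflicting, [t1, t2])` is positive because its restriction to the species of `t1` groups them differently.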

Available Versions

IDEALS @ Illinois

Illinois Digital Environment for Access to Learning and Scholarship Repository
