114 research outputs found

    An Algebraic View of the Relation between Largest Common Subtrees and Smallest Common Supertrees

    The relationship between two important problems in tree pattern matching, the largest common subtree and the smallest common supertree problems, is established by means of simple constructions, which allow one to obtain a largest common subtree of two trees from a smallest common supertree of them, and vice versa. These constructions are the same for isomorphic, homeomorphic, topological, and minor embeddings, they take only time linear in the size of the trees, and they turn out to have a clear algebraic meaning. Comment: 32 pages.

    Maximum agreement and compatible supertrees

    Given a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively the MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT. A sufficient condition is given that identifies cases where these problems can be solved by resorting to MAST and MCT as subproblems. This condition is met, for instance, when only two input trees are considered. Then we give algorithms for SMAST and SMCT that benefit from the link with the subtree problems. These algorithms run in time linear in the time needed to solve MAST, respectively MCT, on an instance of the same or smaller size. It is shown that arbitrary instances of SMAST and SMCT can be turned in polynomial time into instances composed of trees with a bounded number of leaves. SMAST is shown to be W[2]-hard when the considered parameter is the number of input leaves that have to be removed to obtain the agreement of the input trees. A similar result holds for SMCT. Moreover, the corresponding optimization problems, that is, the complements of SMAST and SMCT, cannot be approximated in polynomial time within any constant factor, unless P=NP. These results also hold when the input trees have a bounded number of leaves. The presented results apply to both collections of rooted and unrooted trees.
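
    To make the MAST definition above concrete, here is a minimal brute-force sketch for rooted trees with identical leaf sets. It is not one of the algorithms from the paper and takes exponential time. The nested-tuple tree encoding and the names `restrict`, `clusters`, and `mast_brute_force` are assumptions made for this illustration. It also assumes that two rooted leaf-labelled trees are isomorphic exactly when they induce the same set of clusters.

```python
from itertools import combinations


def restrict(tree, keep):
    """Restrict a nested-tuple tree to the leaves in `keep`,
    suppressing internal nodes left with a single child."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def clusters(tree):
    """Return (leaf set, set of clusters of internal nodes)."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)
    return leaves, found


def mast_brute_force(trees):
    """Try leaf subsets from largest to smallest and return the first
    one on which all restricted trees have identical cluster sets."""
    leaves, _ = clusters(trees[0])
    for size in range(len(leaves), 0, -1):
        for subset in combinations(sorted(leaves), size):
            keep = frozenset(subset)
            restricted = [clusters(restrict(t, keep))[1] for t in trees]
            if all(c == restricted[0] for c in restricted):
                return keep
    return frozenset()


# Two hypothetical rooted trees on the leaves a, b, c, d.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
agreement_leaves = mast_brute_force([t1, t2])
```

    The two example trees disagree on all four leaves but agree once `c` is dropped, so the returned leaf set has size three.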

    Post-processing of phylogenetic trees


    Robinson-Foulds Supertrees

    Background: Supertree methods synthesize collections of small phylogenetic trees with incomplete taxon overlap into comprehensive trees, or supertrees, that include all taxa found in the input trees. Supertree methods based on the well established Robinson-Foulds (RF) distance have the potential to build supertrees that retain much information from the input trees. Specifically, the RF supertree problem seeks a binary supertree that minimizes the sum of the RF distances from the supertree to the input trees. Thus, an RF supertree is a supertree that is consistent with the largest number of clusters (or clades) from the input trees. Results: We introduce efficient, local search based, hill-climbing heuristics for the intrinsically hard RF supertree problem on rooted trees. These heuristics use novel non-trivial algorithms for the SPR and TBR local search problems which improve on the time complexity of the best known (naïve) solutions by a factor of Θ(n) and Θ(n²) respectively (where n is the number of taxa, or leaves, in the supertree). We use an implementation of our new algorithms to examine the performance of the RF supertree method and compare it to matrix representation with parsimony (MRP) and the triplet supertree method using four supertree data sets. Not only did our RF heuristic provide fast estimates of RF supertrees in all data sets, but the RF supertrees also retained more of the information from the input trees (based on the RF distance) than the other supertree methods. Conclusions: Our heuristics for the RF supertree problem, based on our new local search algorithms, make it possible for the first time to estimate large supertrees by directly optimizing the RF distance from rooted input trees to the supertrees. This provides a new and fast method to build accurate supertrees. RF supertrees may also be useful for estimating majority-rule(-) supertrees, which are a generalization of majority-rule consensus trees.
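
    As a rough illustration of the objective described in the abstract above, the sketch below scores a candidate supertree by summing RF distances to rooted input trees. It is not the authors' hill-climbing heuristic. The nested-tuple tree encoding and the names `clusters`, `restrict`, `rf_distance`, and `rf_supertree_score` are assumptions made for this example. It also assumes that the supertree is restricted to each input tree's leaf set before comparison, and that the RF distance is the size of the symmetric difference of the cluster sets (some definitions halve this value).

```python
def clusters(tree):
    """Return (leaf set, set of clusters of internal nodes)."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)
    return leaves, found


def restrict(tree, keep):
    """Restrict a tree to the leaves in `keep`, suppressing
    internal nodes left with a single child."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def rf_distance(tree_a, tree_b):
    """Count clusters present in exactly one of two rooted trees
    that share the same leaf set."""
    leaves_a, clusters_a = clusters(tree_a)
    leaves_b, clusters_b = clusters(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must have identical leaf sets")
    return len(clusters_a ^ clusters_b)


def rf_supertree_score(supertree, input_trees):
    """Sum RF distances between each input tree and the supertree
    restricted to that input tree's leaves."""
    total = 0
    for tree in input_trees:
        leaves, _ = clusters(tree)
        total += rf_distance(restrict(supertree, leaves), tree)
    return total


# Hypothetical candidate supertree and two input trees.
supertree = ((("a", "b"), "c"), ("d", "e"))
inputs = [(("a", "b"), "c"), (("a", "d"), ("c", "e"))]
score = rf_supertree_score(supertree, inputs)
```

    In this example the first input tree agrees with the restricted supertree, while the second contributes four mismatched clusters. A local search in the spirit of the abstract would compare such scores across neighbouring supertrees.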

    Picoeukaryotic sequences in the Sargasso Sea metagenome

    Many sequences from picoeukaryotes were found in DNA sequence data assembled from Sargasso seawater.

    Minimal Phylogenetic Supertrees and Local Consensus Trees

    The problem of constructing a minimally resolved phylogenetic supertree (i.e., having the smallest possible number of internal nodes) that contains all of the rooted triplets from a consistent set R is known to be NP-hard. In this paper, we prove that constructing a phylogenetic tree consistent with R that contains the minimum number of additional rooted triplets is also NP-hard, and develop exact, exponential-time algorithms for both problems. The new algorithms are applied to construct two variants of the local consensus tree; for any set S of phylogenetic trees over some leaf label set L, this gives a minimal phylogenetic tree over L that contains every rooted triplet present in all trees in S, where "minimal" means either having the smallest possible number of internal nodes or the smallest possible number of rooted triplets. The second variant generalizes the RV-II tree, introduced by Kannan, Warnow, and Yooseph in 1998.
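
    The rooted triplets mentioned in the abstract above can be illustrated with a short sketch. It lists the rooted triplets of a tree and intersects them across a set of trees. It does not build the minimal trees the paper is concerned with. The nested-tuple encoding and the names `clusters`, `rooted_triplets`, and `common_triplets` are assumptions made for this example. A triplet xy|z is represented as the pair `(frozenset({x, y}), z)` and is taken to be present when some cluster contains x and y but not z.

```python
from itertools import combinations


def clusters(tree):
    """Return (leaf set, set of clusters of internal nodes)."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)
    return leaves, found


def rooted_triplets(tree):
    """Return every rooted triplet xy|z displayed by the tree."""
    leaves, found = clusters(tree)
    triplets = set()
    for cluster in found:
        outside = leaves - cluster
        for x, y in combinations(sorted(cluster), 2):
            for z in outside:
                triplets.add((frozenset((x, y)), z))
    return triplets


def common_triplets(trees):
    """Return the rooted triplets present in all of the given trees."""
    return set.intersection(*(rooted_triplets(t) for t in trees))


# Two hypothetical trees over the leaf label set {a, b, c, d}.
caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
shared = common_triplets([caterpillar, balanced])
```

    Here the two trees share only the triplets ab|c and ab|d. A local consensus tree in the sense of the abstract would have to contain at least these.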