
    Unique Perfect Phylogeny Characterizations via Uniquely Representable Chordal Graphs

    The perfect phylogeny problem is a classic problem in computational biology, where we seek an unrooted phylogeny that is compatible with a set of qualitative characters. Such a tree exists precisely when an intersection graph associated with the character set, called the partition intersection graph, can be triangulated using a restricted set of fill edges. Semple and Steel used the partition intersection graph to characterize when a character set has a unique perfect phylogeny. Bordewich, Huber, and Semple showed how to use the partition intersection graph to find a maximum compatible set of characters. In this paper, we build on these results, characterizing when a unique perfect phylogeny exists for a subset of partial characters. Our characterization is stated in terms of minimal triangulations of the partition intersection graph that are uniquely representable, also known as ur-chordal graphs. Our characterization is motivated by the structure of ur-chordal graphs, and the fact that the block structure of minimal triangulations is mirrored in the graph that has been triangulated.
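As an illustration of the graph this abstract refers to, here is a minimal sketch of how a partition intersection graph could be built. It assumes the usual definition (one vertex per character state, with an edge whenever the two states share a taxon); the function name `partition_intersection_graph` and the input layout are hypothetical and not taken from the paper.

```python
from itertools import combinations

def partition_intersection_graph(characters):
    """Build a partition intersection graph (illustrative sketch).

    characters: dict mapping a character name to a list of disjoint
    sets of taxa (the cells, one per state). Partial characters are
    allowed: a character need not cover every taxon.
    Returns (vertices, edges); a vertex is a (character, cell index) pair.
    """
    vertices = [(name, i)
                for name, cells in characters.items()
                for i in range(len(cells))]
    edges = set()
    for (c1, i), (c2, j) in combinations(vertices, 2):
        # Cells of the same character are disjoint, so only cells of
        # different characters can share a taxon.
        if c1 != c2 and characters[c1][i] & characters[c2][j]:
            edges.add(frozenset([(c1, i), (c2, j)]))
    return vertices, edges

# Two binary characters on taxa a, b, c, d that conflict with each other.
characters = {
    "chi1": [{"a", "b"}, {"c", "d"}],
    "chi2": [{"a", "c"}, {"b", "d"}],
}
vertices, edges = partition_intersection_graph(characters)
```

In this example the graph is a 4-cycle. Its only chords would join two states of the same character, which the restricted triangulation does not permit, so the two characters admit no perfect phylogeny.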

    Fixed Parameter Polynomial Time Algorithms for Maximum Agreement and Compatible Supertrees

    Consider a set of labels $L$ and a set of trees ${\mathcal T} = \{{\mathcal T}^{(1)}, {\mathcal T}^{(2)}, \ldots, {\mathcal T}^{(k)}\}$ where each tree ${\mathcal T}^{(i)}$ is distinctly leaf-labeled by some subset of $L$. One fundamental problem is to find the biggest tree (denoted as supertree) to represent ${\mathcal T}$ which minimizes the disagreements with the trees in ${\mathcal T}$ under certain criteria. This problem finds applications in phylogenetics, database, and data mining. In this paper, we focus on two particular supertree problems, namely, the maximum agreement supertree problem (MASP) and the maximum compatible supertree problem (MCSP). These two problems are known to be NP-hard for $k \geq 3$. This paper gives the first polynomial time algorithms for both MASP and MCSP when both $k$ and the maximum degree $D$ of the trees are constant.

    Molecular footprint of drug-selective pressure in a human immunodeficiency virus transmission chain

    Known human immunodeficiency virus (HIV) transmission histories are invaluable models for investigating the evolutionary and transmission dynamics of the virus and to assess the accuracy of phylogenetic reconstructions. Here we have characterized an HIV-1 transmission chain consisting of nine infected patients, almost all of whom were treated with antiviral drugs at later stages of infection. Partial pol and env gp41 regions of the HIV genome were directly sequenced from plasma viral RNA for at least one sample from each patient. Phylogenetic analyses in pol using likelihood methods inferred an evolutionary history not fully compatible with the known transmission history. This could be attributed to parallel evolution of drug resistance mutations resulting in the incorrect clustering of multidrug-resistant virus. On the other hand, a fully compatible phylogenetic tree was reconstructed from the env sequences. We were able to identify and quantify the molecular footprint of drug-selective pressure in pol using maximum likelihood inference under different codon substitution models. An increased fixation rate of mutations in the HIV population of the multidrug-resistant patient was demonstrated using molecular clock modeling. We show that molecular evolutionary analyses, guided by a known transmission history, can reveal the presence of confounding factors like natural selection and caution should be taken when accurate descriptions of HIV evolution are required. Status: published.

    Circumstances in which parsimony but not compatibility will be provably misleading

    Phylogenetic methods typically rely on an appropriate model of how data evolved in order to infer an accurate phylogenetic tree. For molecular data, standard statistical methods have provided an effective strategy for extracting phylogenetic information from aligned sequence data when each site (character) is subject to a common process. However, for other types of data (e.g. morphological data), characters can be too ambiguous, homoplastic or saturated to develop models that are effective at capturing the underlying process of change. To address this, we examine the properties of a classic but neglected method for inferring splits in an underlying tree, namely, maximum compatibility. By adopting a simple and extreme model in which each character either fits perfectly on some tree, or is entirely random (but it is not known which class any character belongs to) we are able to derive exact and explicit formulae regarding the performance of maximum compatibility. We show that this method is able to identify a set of non-trivial homoplasy-free characters, when the number n of taxa is large, even when the number of random characters is large. By contrast, we show that a method that makes more uniform use of all the data, maximum parsimony, can provably estimate trees in which none of the original homoplasy-free characters support splits. Comment: 37 pages, 2 figures.

    A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest

    We give a 2-approximation algorithm for the Maximum Agreement Forest problem on two rooted binary trees. This NP-hard problem has been studied extensively in the past two decades, since it can be used to compute the Subtree Prune-and-Regraft (SPR) distance between two phylogenetic trees. Our result improves on the very recent 2.5-approximation algorithm due to Shi, Feng, You and Wang (2015). Our algorithm is the first approximation algorithm for this problem that uses LP duality in its analysis.

    Representing Partitions on Trees

    In evolutionary biology, biologists often face the problem of constructing a phylogenetic tree on a set X of species from a multiset Π of partitions corresponding to various attributes of these species. One approach that is used to solve this problem is to try instead to associate a tree (or even a network) to the multiset Σ_Π consisting of all those bipartitions {A, X − A} with A a part of some partition in Π. The rationale behind this approach is that a phylogenetic tree with leaf set X can be uniquely represented by the set of bipartitions of X induced by its edges. Motivated by these considerations, given a multiset Σ of bipartitions corresponding to a phylogenetic tree on X, in this paper we introduce and study the set P(Σ) consisting of those multisets of partitions Π of X with Σ_Π = Σ. More specifically, we characterize when P(Σ) is non-empty, and also identify some partitions in P(Σ) that are of maximum and minimum size. We also show that it is NP-complete to decide when P(Σ) is non-empty in case Σ is an arbitrary multiset of bipartitions of X. Ultimately, we hope that by gaining a better understanding of the mapping that takes an arbitrary partition system Π to the multiset Σ_Π, we will obtain new insights into the use of median networks and, more generally, split-networks to visualize sets of partitions.
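The mapping from a multiset of partitions Π to its multiset of bipartitions {A, X − A} can be sketched in a few lines. This is only an illustrative reading of the abstract, not the authors' construction: the name `bipartition_multiset` is hypothetical, every part contributes one copy of its bipartition (so a two-part partition contributes the same bipartition twice), and a part equal to all of X is skipped because its complement is empty. Both conventions are assumptions.

```python
from collections import Counter

def bipartition_multiset(X, partitions):
    """Collect the bipartitions {A, X - A} for every part A of every
    partition, with multiplicities, as a Counter (illustrative sketch)."""
    X = frozenset(X)
    sigma = Counter()
    for partition in partitions:
        for part in partition:
            A = frozenset(part)
            complement = X - A
            if not complement:
                # Assumption: a part covering all of X yields no bipartition.
                continue
            sigma[frozenset([A, complement])] += 1
    return sigma

X = {"a", "b", "c", "d"}
Pi = [
    [{"a", "b"}, {"c"}, {"d"}],   # a three-part partition of X
    [{"a", "b"}, {"c", "d"}],     # a two-part partition of X
]
sigma = bipartition_multiset(X, Pi)
```

Here the bipartition {a,b} | {c,d} receives multiplicity 3 under the stated conventions: once from the first partition and twice from the second.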