Unique Perfect Phylogeny Characterizations via Uniquely Representable Chordal Graphs
The perfect phylogeny problem is a classic problem in computational biology,
where we seek an unrooted phylogeny that is compatible with a set of
qualitative characters. Such a tree exists precisely when an intersection graph
associated with the character set, called the partition intersection graph, can
be triangulated using a restricted set of fill edges. Semple and Steel used the
partition intersection graph to characterize when a character set has a unique
perfect phylogeny. Bordewich, Huber, and Semple showed how to use the partition
intersection graph to find a maximum compatible set of characters. In this
paper, we build on these results, characterizing when a unique perfect
phylogeny exists for a subset of partial characters. Our characterization is
stated in terms of minimal triangulations of the partition intersection graph
that are uniquely representable, also known as ur-chordal graphs. Our
characterization is motivated by the structure of ur-chordal graphs, and the
fact that the block structure of minimal triangulations is mirrored in the
graph that has been triangulated.
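The partition intersection graph mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition: one vertex per (character, state) pair, with an edge whenever the two states share at least one taxon. The function name and the input layout (a dict mapping each character to its states, and each state to a set of taxa) are hypothetical choices made for this example, not taken from the paper.

```python
from itertools import combinations


def partition_intersection_graph(characters):
    """Build the partition intersection graph of a character set.

    `characters` maps a character name to a dict {state: set of taxa}.
    The states of one character are assumed to be pairwise disjoint,
    so no edge ever joins two states of the same character.
    """
    vertices = [(c, s) for c, states in characters.items() for s in states]
    edges = set()
    for (c1, s1), (c2, s2) in combinations(vertices, 2):
        # Two (character, state) vertices are adjacent when they share a taxon.
        if characters[c1][s1] & characters[c2][s2]:
            edges.add(frozenset([(c1, s1), (c2, s2)]))
    return vertices, edges


# Two binary characters on the taxa a, b, c, d.
example = {
    "c1": {0: {"a", "b"}, 1: {"c", "d"}},
    "c2": {0: {"a", "c"}, 1: {"b", "d"}},
}
vertices, edges = partition_intersection_graph(example)
```

In this example the four vertices form a chordless 4-cycle. Its only possible chords would join two states of the same character, which is the kind of fill edge a restricted triangulation is usually taken to exclude.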
Fixed Parameter Polynomial Time Algorithms for Maximum Agreement and Compatible Supertrees
Consider a set of labels $L$ and a set of trees ${\mathcal T} = \{{\mathcal
T}^{(1)}, {\mathcal T}^{(2)}, \ldots, {\mathcal T}^{(k)}\}$ where each tree
${\mathcal T}^{(i)}$ […] $L$ […] ${\mathcal T}$ […] ${\mathcal T}$ […] $k \geq 3$ […] $k$ […] $D$
of the trees are constant.
Molecular footprint of drug-selective pressure in a human immunodeficiency virus transmission chain
Known human immunodeficiency virus (HIV) transmission histories are invaluable models for investigating the evolutionary and transmission dynamics of the virus and to assess the accuracy of phylogenetic reconstructions. Here we have characterized an HIV-1 transmission chain consisting of nine infected patients, almost all of whom were treated with antiviral drugs at later stages of infection. Partial pol and env gp41 regions of the HIV genome were directly sequenced from plasma viral RNA for at least one sample from each patient. Phylogenetic analyses in pol using likelihood methods inferred an evolutionary history not fully compatible with the known transmission history. This could be attributed to parallel evolution of drug resistance mutations resulting in the incorrect clustering of multidrug-resistant virus. On the other hand, a fully compatible phylogenetic tree was reconstructed from the env sequences. We were able to identify and quantify the molecular footprint of drug-selective pressure in pol using maximum likelihood inference under different codon substitution models. An increased fixation rate of mutations in the HIV population of the multidrug-resistant patient was demonstrated using molecular clock modeling. We show that molecular evolutionary analyses, guided by a known transmission history, can reveal the presence of confounding factors like natural selection and caution should be taken when accurate descriptions of HIV evolution are required.
Circumstances in which parsimony but not compatibility will be provably misleading
Phylogenetic methods typically rely on an appropriate model of how data
evolved in order to infer an accurate phylogenetic tree. For molecular data,
standard statistical methods have provided an effective strategy for extracting
phylogenetic information from aligned sequence data when each site (character)
is subject to a common process. However, for other types of data (e.g.
morphological data), characters can be too ambiguous, homoplastic or saturated
to develop models that are effective at capturing the underlying process of
change. To address this, we examine the properties of a classic but neglected
method for inferring splits in an underlying tree, namely, maximum
compatibility. By adopting a simple and extreme model in which each character
either fits perfectly on some tree, or is entirely random (but it is not known
which class any character belongs to) we are able to derive exact and explicit
formulae regarding the performance of maximum compatibility. We show that this
method is able to identify a set of non-trivial homoplasy-free characters, when
the number of taxa is large, even when the number of random characters is
large. By contrast, we show that a method that makes more uniform use of all
the data, maximum parsimony, can provably estimate trees in which none
of the original homoplasy-free characters support splits.
Comment: 37 pages, 2 figures
A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest
We give a 2-approximation algorithm for the Maximum Agreement Forest problem
on two rooted binary trees. This NP-hard problem has been studied extensively
in the past two decades, since it can be used to compute the Subtree
Prune-and-Regraft (SPR) distance between two phylogenetic trees. Our result
improves on the very recent 2.5-approximation algorithm due to Shi, Feng, You
and Wang (2015). Our algorithm is the first approximation algorithm for this
problem that uses LP duality in its analysis.
Representing Partitions on Trees
In evolutionary biology, biologists often face the problem of constructing a phylogenetic tree on a set X of species from a multiset Π of partitions corresponding to various attributes of these species. One approach that is used to solve this problem is to try instead to associate a tree (or even a network) to the multiset ΣΠ consisting of all those bipartitions {A,X − A} with A a part of some partition in Π. The rationale behind this approach is that a phylogenetic tree with leaf set X can be uniquely represented by the set of bipartitions of X induced by its edges. Motivated by these considerations, given a multiset Σ of bipartitions corresponding to a phylogenetic tree on X, in this paper we introduce and study the set P(Σ) consisting of those multisets of partitions Π of X with ΣΠ = Σ. More specifically, we characterize when P(Σ) is non-empty, and also identify some partitions in P(Σ) that are of maximum and minimum size. We also show that it is NP-complete to decide when P(Σ) is non-empty in case Σ is an arbitrary multiset of bipartitions of X. Ultimately, we hope that by gaining a better understanding of the mapping that takes an arbitrary partition system Π to the multiset ΣΠ, we will obtain new insights into the use of median networks and, more generally, split-networks to visualize sets of partitions.
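The map from a partition system Π to the multiset ΣΠ described in this abstract can be sketched as follows. This is an illustrative reading of the definition only: the function name is hypothetical, the multiset is modelled with a `Counter`, and the sketch assumes every part A contributes one copy of {A, X − A}. How the paper itself counts multiplicities (for instance when Π already contains a bipartition) is not stated in the abstract.

```python
from collections import Counter


def induced_bipartitions(X, partitions):
    """Return the multiset of bipartitions {A, X - A} over all parts A.

    `X` is the set of species; `partitions` is a list of partitions of X,
    each given as a list of disjoint, non-empty sets covering X.
    Assumption: each part contributes exactly one copy of its bipartition.
    """
    X = frozenset(X)
    sigma = Counter()
    for partition in partitions:
        for part in partition:
            A = frozenset(part)
            # A bipartition is stored as an unordered pair of frozensets.
            sigma[frozenset([A, X - A])] += 1
    return sigma


X = {1, 2, 3, 4}
Pi = [[{1, 2}, {3}, {4}]]
sigma = induced_bipartitions(X, Pi)
```

Here the single partition {1,2} | {3} | {4} yields three bipartitions: {1,2} | {3,4}, {3} | {1,2,4} and {4} | {1,2,3}.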