4,467 research outputs found

    Circumstances in which parsimony but not compatibility will be provably misleading

    Phylogenetic methods typically rely on an appropriate model of how data evolved in order to infer an accurate phylogenetic tree. For molecular data, standard statistical methods have provided an effective strategy for extracting phylogenetic information from aligned sequence data when each site (character) is subject to a common process. However, for other types of data (e.g. morphological data), characters can be too ambiguous, homoplastic or saturated to develop models that are effective at capturing the underlying process of change. To address this, we examine the properties of a classic but neglected method for inferring splits in an underlying tree, namely, maximum compatibility. By adopting a simple and extreme model in which each character either fits perfectly on some tree, or is entirely random (but it is not known which class any character belongs to), we are able to derive exact and explicit formulae regarding the performance of maximum compatibility. We show that this method is able to identify a set of non-trivial homoplasy-free characters, when the number $n$ of taxa is large, even when the number of random characters is large. By contrast, we show that a method that makes more uniform use of all the data (maximum parsimony) can provably estimate trees in which none of the original homoplasy-free characters support splits. Comment: 37 pages, 2 figures
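
    As a rough illustration of what maximum compatibility computes, here is a minimal Python sketch for binary characters only. It is not the analysis from the abstract above: the function names and the example characters are hypothetical, and the search is brute force. It relies on the standard four-gamete test and on the fact that, for binary characters, pairwise compatibility implies joint compatibility.

```python
from itertools import combinations


def compatible(a, b):
    """Four-gamete test for two binary characters (one state per taxon).

    The characters are incompatible exactly when all four state pairs
    (0,0), (0,1), (1,0), (1,1) occur across the taxa.
    """
    return len(set(zip(a, b))) < 4


def max_compatible_subset(chars):
    """Indices of a largest pairwise-compatible subset.

    Brute force over subsets, largest first: exponential time,
    intended for tiny illustrative inputs only.
    """
    for size in range(len(chars), 0, -1):
        for subset in combinations(range(len(chars)), size):
            if all(compatible(chars[i], chars[j])
                   for i, j in combinations(subset, 2)):
                return list(subset)
    return []


# Hypothetical data: three binary characters scored on five taxa.
c0 = [0, 0, 1, 1, 1]
c1 = [0, 0, 0, 1, 1]
c2 = [0, 1, 0, 1, 0]  # conflicts with both c0 and c1

print(max_compatible_subset([c0, c1, c2]))  # → [0, 1]
```

    Here `c2` plays the role of a "random" character: it is discarded because it shows all four state combinations against each of the other two.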

    Improved Lower Bounds on the Compatibility of Multi-State Characters

    We study a long-standing conjecture on the necessary and sufficient conditions for the compatibility of multi-state characters: There exists a function $f(r)$ such that, for any set $C$ of $r$-state characters, $C$ is compatible if and only if every subset of $f(r)$ characters of $C$ is compatible. We show that for every $r \ge 2$, there exists an incompatible set $C$ of $\lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ $r$-state characters such that every proper subset of $C$ is compatible. Thus, $f(r) \ge \lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ for every $r \ge 2$. This improves the previous lower bound of $f(r) \ge r$ given by Meacham (1983), and generalizes the construction showing that $f(4) \ge 5$ given by Habib and To (2011). We prove our result via a result on quartet compatibility that may be of independent interest: For every integer $n \ge 4$, there exists an incompatible set $Q$ of $\lfloor\frac{n-2}{2}\rfloor\cdot\lceil\frac{n-2}{2}\rceil + 1$ quartets over $n$ labels such that every proper subset of $Q$ is compatible. We contrast this with a result on the compatibility of triplets: For every $n \ge 3$, if $R$ is an incompatible set of more than $n-1$ triplets over $n$ labels, then some proper subset of $R$ is incompatible. We show this upper bound is tight by exhibiting, for every $n \ge 3$, a set $R$ of $n-1$ triplets over $n$ taxa such that $R$ is incompatible, but every proper subset of $R$ is compatible.
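
    To make the stated bound concrete, the following short Python sketch simply evaluates $\lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ for small $r$. The function name `lower_bound` is a hypothetical helper, not something from the paper. The quartet bound in the abstract is the same expression evaluated at $n-2$.

```python
def lower_bound(r):
    """Evaluate floor(r/2) * ceil(r/2) + 1, the lower bound on f(r) for r >= 2."""
    if r < 2:
        raise ValueError("the bound is stated for r >= 2")
    # For integers, r // 2 is floor(r/2) and (r + 1) // 2 is ceil(r/2).
    return (r // 2) * ((r + 1) // 2) + 1


for r in range(2, 7):
    print(r, lower_bound(r))
# prints:
# 2 2
# 3 3
# 4 5
# 5 7
# 6 10
```

    The value at $r = 4$ is 5, matching the $f(4) \ge 5$ construction the abstract attributes to Habib and To (2011), and the expression exceeds the earlier bound $r$ from $r = 4$ onward.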

    Integration of Morphological Data into Molecular Phylogenetic Analysis: Toward the Identikit of the Stylasterid Ancestor

    Stylasteridae is a hydroid family including 29 worldwide-distributed genera, all provided with a calcareous skeleton. They are abundant in shallow and deep waters and represent an important component of marine communities. In the present paper, we studied the evolution of ten morphological characters, currently used in stylasterid taxonomy, using a phylogenetic approach. Our results indicate that stylasterid morphology is highly plastic and that many events of independent evolution and reversion have occurred. Our analysis also allows sketching a possible identikit of the stylasterid ancestor. It had a calcareous skeleton, reticulate-granular coenosteal texture, randomly arranged polyps, a gastrostyle, and dactylopore spines, while lacking a gastropore lip and dactylostyles. Whether the ancestor had a single or a double/multiple chambered gastropore tube is uncertain. These data suggest that the ancestor was similar to the extant genera Cyclohelia and Stellapora. Our investigation is the first attempt to integrate molecular and morphological information to clarify the stylasterid evolutionary scenario and represents the first step toward inferring the morphology of the stylasterid ancestor. © 2016 Puce et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

    Finding Optimal Tree Decompositions

    The task of organizing a given graph into a structure called a tree decomposition is relevant in multiple areas of computer science. In particular, many NP-hard problems can be solved in polynomial time if a suitable tree decomposition of a graph describing the problem instance is given as a part of the input. This motivates the task of finding as good tree decompositions as possible, or ideally, optimal tree decompositions. This thesis is about finding optimal tree decompositions of graphs with respect to several notions of optimality. Each of the considered notions measures the quality of a tree decomposition in the context of an application. In particular, we consider a total of seven problems that are formulated as finding optimal tree decompositions: treewidth, minimum fill-in, generalized and fractional hypertreewidth, total table size, phylogenetic character compatibility, and treelength. For each of these problems we consider the BT algorithm of Bouchitté and Todinca as the method of finding optimal tree decompositions. The BT algorithm is well-known on the theoretical side, but to our knowledge the first time it was implemented was only recently for the 2nd Parameterized Algorithms and Computational Experiments Challenge (PACE 2017). The author’s implementation of the BT algorithm took the second place in the minimum fill-in track of PACE 2017. In this thesis we review and extend the BT algorithm and our implementation. In particular, we improve the efficiency of the algorithm in terms of both theory and practice. We also implement the algorithm for each of the seven problems considered, introducing a novel adaptation of the algorithm for the maximum compatibility problem of phylogenetic characters.
    Our implementation outperforms alternative state-of-the-art approaches in terms of numbers of test instances solved on well-known benchmarks on minimum fill-in, generalized hypertreewidth, fractional hypertreewidth, total table size, and the maximum compatibility problem of phylogenetic characters. Furthermore, to our understanding the implementation is the first exact approach for the treelength problem.
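
    For readers unfamiliar with the object being optimized, the sketch below checks the standard textbook definition of a tree decomposition and computes its width (largest bag size minus one). It is not the BT algorithm and not the thesis implementation; all names and the example graph are hypothetical, and `tree_edges` is assumed, without checking, to form a tree over the bag identifiers.

```python
from collections import defaultdict


def decomposition_width(bags):
    """Width = size of the largest bag minus one."""
    return max(len(bag) for bag in bags.values()) - 1


def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three defining conditions of a tree decomposition.

    Assumption: tree_edges already forms a tree on the keys of bags.
    """
    # 1. Every graph vertex appears in at least one bag.
    if not all(any(v in bag for bag in bags.values()) for v in vertices):
        return False
    # 2. Both endpoints of every graph edge share at least one bag.
    if not all(any(u in bag and v in bag for bag in bags.values())
               for u, v in edges):
        return False
    # 3. For each vertex, the bags containing it induce a connected subtree.
    adjacency = defaultdict(set)
    for a, b in tree_edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    for v in vertices:
        holding = {node for node, bag in bags.items() if v in bag}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node] & holding:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        if seen != holding:
            return False
    return True


# Hypothetical example: the path graph a - b - c - d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
tree_edges = [(0, 1), (1, 2)]
good_bags = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}}
bad_bags = {0: {"a", "b"}, 1: {"c", "d"}, 2: {"b", "c"}}  # bags with "b" not adjacent

print(is_tree_decomposition(vertices, edges, good_bags, tree_edges))  # → True
print(decomposition_width(good_bags))                                 # → 1
print(is_tree_decomposition(vertices, edges, bad_bags, tree_edges))   # → False
```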

    Probabilistic Graphical Model Representation in Phylogenetics

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (1) reproducibility of an analysis, (2) model development and (3) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and non-specialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well suited to teaching statistical models, to facilitating communication among phylogeneticists, and to developing generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.
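
    The idea of factoring a model into conditionally independent distributions and then sampling its posterior can be sketched on a deliberately non-phylogenetic toy model. Everything below is an assumption made for illustration: a two-level model with prior mu ~ Normal(0, 1) and observations x_i ~ Normal(mu, 1), a random-walk Metropolis-Hastings proposal, and hypothetical data and function names. It does not involve tree plates or any software from the paper.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_joint(mu, data):
    """Joint log density as a sum over the nodes of the model graph:
    one prior node for mu plus one conditionally independent node per datum."""
    lp = log_normal_pdf(mu, 0.0, 1.0)                     # mu ~ Normal(0, 1)
    lp += sum(log_normal_pdf(x, mu, 1.0) for x in data)   # x_i | mu ~ Normal(mu, 1)
    return lp


def metropolis_hastings(data, n_samples=20000, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings sampler for the posterior of mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_joint(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)   # symmetric proposal
        candidate = log_joint(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Hypothetical observations.
data = [1.2, 0.8, 1.0, 1.4, 0.6]
samples = metropolis_hastings(data)
kept = samples[2000:]                          # discard burn-in
posterior_mean = sum(kept) / len(kept)
exact_mean = sum(data) / (len(data) + 1)       # conjugate-normal closed form
print(round(posterior_mean, 2), round(exact_mean, 2))
```

    Because this toy model is conjugate, the sampled mean can be compared against the closed-form posterior mean, which is a convenient sanity check that is not available for realistic phylogenetic models.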