
4,467 research outputs found

Finding Most Compatible Phylogenetic Trees over Multi-State Characters

Authors: Järvisalo Matti; Korhonen Tuukka
Publication venue: AAAI Press
Publication date: 01/01/2020

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Association for the Advancement of Artificial Intelligence: AAAI Publications

Circumstances in which parsimony but not compatibility will be provably misleading

Authors: Scotland Robert W.; Steel Mike
Publication date: 01/01/2015

Phylogenetic methods typically rely on an appropriate model of how data evolved in order to infer an accurate phylogenetic tree. For molecular data, standard statistical methods have provided an effective strategy for extracting phylogenetic information from aligned sequence data when each site (character) is subject to a common process. However, for other types of data (e.g. morphological data), characters can be too ambiguous, homoplastic or saturated to develop models that are effective at capturing the underlying process of change. To address this, we examine the properties of a classic but neglected method for inferring splits in an underlying tree, namely, maximum compatibility. By adopting a simple and extreme model in which each character either fits perfectly on some tree, or is entirely random (but it is not known which class any character belongs to) we are able to derive exact and explicit formulae regarding the performance of maximum compatibility. We show that this method is able to identify a set of non-trivial homoplasy-free characters, when the number $n$ of taxa is large, even when the number of random characters is large. By contrast, we show that a method that makes more uniform use of all the data, maximum parsimony, can provably estimate trees in which none of the original homoplasy-free characters support splits. Comment: 37 pages, 2 figures
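As a rough illustration of what maximum compatibility searches for, the sketch below restricts attention to binary characters, an assumption not made in the abstract. In that restricted setting two characters are compatible unless all four state pairs occur among the taxa, and a set of binary characters is compatible exactly when every pair in it is. The function names and the toy data are hypothetical, and the brute-force search is only practical for a handful of characters.

```python
from itertools import combinations

def pairwise_compatible(a, b):
    """Four-gamete test: two binary characters fit a common tree
    unless all four state pairs (0,0), (0,1), (1,0), (1,1) occur."""
    return len(set(zip(a, b))) < 4

def max_compatible_subset(chars):
    """Brute-force search for a largest pairwise-compatible subset.
    For binary characters, pairwise compatibility implies joint compatibility."""
    for k in range(len(chars), 0, -1):
        for idx in combinations(range(len(chars)), k):
            if all(pairwise_compatible(chars[i], chars[j])
                   for i, j in combinations(idx, 2)):
                return list(idx)
    return []

# Hypothetical toy data: each row is one character, each column one taxon.
characters = [
    (0, 0, 1, 1, 1),
    (0, 0, 0, 1, 1),
    (0, 1, 0, 1, 0),
    (1, 1, 0, 0, 0),
]
best = max_compatible_subset(characters)
print(best)  # indices of the characters kept
```

On this toy input the third character conflicts with each of the others, so the search keeps the remaining three.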

arXiv.org e-Print Archive

CiteSeerX

Improved Lower Bounds on the Compatibility of Multi-State Characters

Authors: Fernández-Baca David; Shutters Brad; Vakati Sudheer
Publication date: 01/01/2012

We study a long-standing conjecture on the necessary and sufficient conditions for the compatibility of multi-state characters: There exists a function $f(r)$ such that, for any set $C$ of $r$-state characters, $C$ is compatible if and only if every subset of $f(r)$ characters of $C$ is compatible. We show that for every $r \ge 2$, there exists an incompatible set $C$ of $\lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ $r$-state characters such that every proper subset of $C$ is compatible. Thus, $f(r) \ge \lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ for every $r \ge 2$. This improves the previous lower bound of $f(r) \ge r$ given by Meacham (1983), and generalizes the construction showing that $f(4) \ge 5$ given by Habib and To (2011). We prove our result via a result on quartet compatibility that may be of independent interest: For every integer $n \ge 4$, there exists an incompatible set $Q$ of $\lfloor\frac{n-2}{2}\rfloor\cdot\lceil\frac{n-2}{2}\rceil + 1$ quartets over $n$ labels such that every proper subset of $Q$ is compatible. We contrast this with a result on the compatibility of triplets: For every $n \ge 3$, if $R$ is an incompatible set of more than $n-1$ triplets over $n$ labels, then some proper subset of $R$ is incompatible. We show this upper bound is tight by exhibiting, for every $n \ge 3$, a set $R$ of $n-1$ triplets over $n$ taxa such that $R$ is incompatible, but every proper subset of $R$ is compatible.
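To make the stated lower bound concrete, the short sketch below evaluates floor(r/2) * ceil(r/2) + 1 for small r. The helper name `lower_bound` is hypothetical; only the formula itself comes from the abstract.

```python
def lower_bound(r):
    """floor(r/2) * ceil(r/2) + 1, the lower bound on f(r) stated in the abstract."""
    return (r // 2) * ((r + 1) // 2) + 1

# Tabulate the bound next to the earlier bound f(r) >= r.
for r in range(2, 8):
    print(r, lower_bound(r))
```

The value at r = 4 is 5, in line with the Habib and To construction mentioned above. The bound coincides with r for r = 2 and r = 3 and exceeds it from r = 4 onward. The quartet bound over n labels is the same expression evaluated at n - 2.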

arXiv.org e-Print Archive

CiteSeerX


Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.

Authors: Chan Michelle M; Hussmann Jeffrey A; Jones Matthew G; Khodaverdian Alex; Quinn Jeffrey J; Wang Robert; Weissman Jonathan S; Xu Chenling; Yosef Nir
Publication venue: eScholarship, University of California
Publication date: 01/04/2020

The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia

eScholarship - University of California

Integration of Morphological Data into Molecular Phylogenetic Analysis: Toward the Identikit of the Stylasterid Ancestor

Authors: Negrisolo Enrico Massimiliano; Pica Daniela; Puce Stefania; Schiaparelli Stefano
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

Stylasteridae is a hydroid family including 29 worldwide-distributed genera, all provided with a calcareous skeleton. They are abundant in shallow and deep waters and represent an important component of marine communities. In the present paper, we studied the evolution of ten morphological characters, currently used in stylasterid taxonomy, using a phylogenetic approach. Our results indicate that stylasterid morphology is highly plastic and that many events of independent evolution and reversion have occurred. Our analysis also allows sketching a possible identikit of the stylasterid ancestor. It had a calcareous skeleton, reticulate-granular coenosteal texture, randomly arranged polyps, a gastrostyle, and dactylopore spines, while lacking a gastropore lip and dactylostyles. Whether the ancestor had a single or a double/multiple chambered gastropore tube is uncertain. These data suggest that the ancestor was similar to the extant genera Cyclohelia and Stellapora. Our investigation is the first attempt to integrate molecular and morphological information to clarify the stylasterid evolutionary scenario and represents the first step to infer the stylasterid ancestor morphology. © 2016 Puce et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Directory of Open Access Journals

PubMed Central

IRIS Università Politecnica delle Marche

Archivio istituzionale della ricerca - Università di Genova

Archivio istituzionale della ricerca - Università di Padova

FigShare

Finding Optimal Tree Decompositions

Author: Korhonen Tuukka
Publication venue: Helsingfors universitet
Publication date: 01/01/2020

The task of organizing a given graph into a structure called a tree decomposition is relevant in multiple areas of computer science. In particular, many NP-hard problems can be solved in polynomial time if a suitable tree decomposition of a graph describing the problem instance is given as a part of the input. This motivates the task of finding as good tree decompositions as possible, or ideally, optimal tree decompositions. This thesis is about finding optimal tree decompositions of graphs with respect to several notions of optimality. Each of the considered notions measures the quality of a tree decomposition in the context of an application. In particular, we consider a total of seven problems that are formulated as finding optimal tree decompositions: treewidth, minimum fill-in, generalized and fractional hypertreewidth, total table size, phylogenetic character compatibility, and treelength. For each of these problems we consider the BT algorithm of Bouchitté and Todinca as the method of finding optimal tree decompositions. The BT algorithm is well-known on the theoretical side, but to our knowledge the first time it was implemented was only recently for the 2nd Parameterized Algorithms and Computational Experiments Challenge (PACE 2017). The author’s implementation of the BT algorithm took second place in the minimum fill-in track of PACE 2017. In this thesis we review and extend the BT algorithm and our implementation. In particular, we improve the efficiency of the algorithm in terms of both theory and practice. We also implement the algorithm for each of the seven problems considered, introducing a novel adaptation of the algorithm for the maximum compatibility problem of phylogenetic characters.
Our implementation outperforms alternative state-of-the-art approaches in terms of numbers of test instances solved on well-known benchmarks on minimum fill-in, generalized hypertreewidth, fractional hypertreewidth, total table size, and the maximum compatibility problem of phylogenetic characters. Furthermore, to our understanding the implementation is the first exact approach for the treelength problem.
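For readers unfamiliar with the object being optimized, the sketch below checks the three defining conditions of a tree decomposition and reports its width. It is not the BT algorithm and does not search for an optimal decomposition. The function names and the example graph are hypothetical, and the code assumes that `tree_edges` already forms a tree over the bag identifiers and that every vertex of interest appears in some edge.

```python
def is_tree_decomposition(edges, bags, tree_edges):
    """Check the three defining conditions of a tree decomposition.
    bags maps a tree-node id to a set of graph vertices."""
    vertices = {v for e in edges for v in e}
    # (1) Every graph vertex appears in at least one bag.
    if not vertices <= set().union(*bags.values()):
        return False
    # (2) Both endpoints of every graph edge share at least one bag.
    if not all(any({u, v} <= bag for bag in bags.values()) for u, v in edges):
        return False
    # (3) The bags containing any given vertex form a connected subtree.
    for v in vertices:
        nodes = {t for t, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for a, b in tree_edges:
                for x, y in ((a, b), (b, a)):
                    if x == t and y in nodes and y not in seen:
                        seen.add(y)
                        stack.append(y)
        if seen != nodes:
            return False
    return True

def width(bags):
    """Width of a decomposition: size of the largest bag minus one."""
    return max(len(bag) for bag in bags.values()) - 1

# Hypothetical example: the 4-cycle a-b-c-d split along the chord a-c.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
good = {0: {"a", "b", "c"}, 1: {"a", "c", "d"}}
# One bag per edge along a path: vertex "a" sits in two non-adjacent bags.
bad = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"d", "a"}}

print(is_tree_decomposition(cycle, good, [(0, 1)]), width(good))
print(is_tree_decomposition(cycle, bad, [(0, 1), (1, 2), (2, 3)]))
```

The second candidate fails condition (3), which is why simply placing each edge in its own bag along a path does not give a valid decomposition of a cycle.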

Helsingin yliopiston digitaalinen arkisto

Probabilistic Graphical Model Representation in Phylogenetics

Authors: Boussau Bastien; Heath Tracy A.; Huelsenbeck John P.; Höhna Sebastian; Landis Michael J.; Ronquist Fredrik
Publication date: 09/12/2013

Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (1) reproducibility of an analysis, (2) model development and (3) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and non-specialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution

arXiv.org e-Print Archive

KU ScholarWorks

PubMed Central