20 research outputs found

    Unique Perfect Phylogeny Characterizations via Uniquely Representable Chordal Graphs

    Full text link
    The perfect phylogeny problem is a classic problem in computational biology, where we seek an unrooted phylogeny that is compatible with a set of qualitative characters. Such a tree exists precisely when an intersection graph associated with the character set, called the partition intersection graph, can be triangulated using a restricted set of fill edges. Semple and Steel used the partition intersection graph to characterize when a character set has a unique perfect phylogeny. Bordewich, Huber, and Semple showed how to use the partition intersection graph to find a maximum compatible set of characters. In this paper, we build on these results, characterizing when a unique perfect phylogeny exists for a subset of partial characters. Our characterization is stated in terms of minimal triangulations of the partition intersection graph that are uniquely representable, also known as ur-chordal graphs. Our characterization is motivated by the structure of ur-chordal graphs, and the fact that the block structure of minimal triangulations is mirrored in the graph that has been triangulated

    Contributions to computational phylogenetics and algorithmic self-assembly

    Get PDF
    This dissertation addresses some of the algorithmic and combinatorial problems at the interface between biology and computation. In particular, it focuses on problems in both computational phylogenetics, an area of study in which computation is used to better understand evolutionary relationships, and algorithmic self-assembly, an area of study in which biological processes are used to perform computation. The first set of results investigate inferring phylogenetic trees from multi-state character data. We give a novel characterization of when a set of three-state characters has a perfect phylogeny and make progress on a long-standing conjecture regarding the compatibility of multi-state characters. The next set of results investigate inferring phylogenetic supertrees from collections of smaller input trees when the input trees do not fully agree on the relative positions of the taxa. Two approaches to dealing with such conflicting input trees are considered. The first is to contract a set of edges in the input trees so that the resulting trees have an agreement supertree. The second is to remove a set of taxa from the input trees so that the resulting trees have an agreement supertree. We give fixed-parameter tractable algorithms for both approaches. We then turn to the algorithmic self-assembly of fractal structures from DNA tiles and investigate approximating the Sierpinski triangle and the Sierpinski carpet with strict self-assembly. We prove tight bounds on approximating the Sierpinski triangle and exhibit a class of fractals that are generalizations of the Sierpinski carpet that can approximately self-assemble. We conclude by discussing some ideas for further research

    Fixed parameter algorithms for compatible and agreement supertree problems

    Get PDF
    Biologists represent evolutionary history of species through phylogenetic trees. Leaves of a phylogenetic tree represent the species and internal vertices represent the extinct ancestors. Given a collection of input phylogenetic trees, a common problem in computational biology is to build a supertree that captures the evolutionary history of all the species in the input trees, and is consistent with each of the input trees. In this document we study the tree compatibility and agreement supertree problems. Tree compatibility problem is NP-complete but has been shown to be fixed parameter tractable when parametrized by number of input trees. We characterize the compatible supertree problem in terms of triangulation of a structure called the display graph. We also give an alternative characterization in terms of cuts of the display graph. We show how these characterizations are related to characterization given in terms of triangulation of the edge label intersection graph. We then give a characterization of the agreement supertree problem. In real world data, consistent supertrees do not always exist. Inconsistencies can be dealt with by contraction of edges or removal of taxa. The agreement supertree edge contraction (AST-EC) problem asks if a collection of k rooted trees can be made to agree by contraction of at most p edges. Similarly, the agreement supertree taxon removal (AST-TR) problem asks if a collection of k rooted trees can be made to agree by removal of at most p taxa. We give fixed parameter algorithms for both cases when parametrized by k and p. We study the long standing conjecture on the perfect phylogeny problem; there exists a function f (r) such that a given collection C of r-state characters is compatible if and only if every f (r) subset of C is compatible. We will show that for r ≥ 2, f (r) ≥ lceil (r/2) rceil * lfloor(r/2)rfloor + 1

    Explaining Evolution via Constrained Persistent Perfect Phylogeny

    Get PDF
    BACKGROUND: The perfect phylogeny is an often used model in phylogenetics since it provides an efficient basic procedure for representing the evolution of genomic binary characters in several frameworks, such as for example in haplotype inference. The model, which is conceptually the simplest, is based on the infinite sites assumption, that is no character can mutate more than once in the whole tree. A main open problem regarding the model is finding generalizations that retain the computational tractability of the original model but are more flexible in modeling biological data when the infinite site assumption is violated because of e.g. back mutations. A special case of back mutations that has been considered in the study of the evolution of protein domains (where a domain is acquired and then lost) is persistency, that is the fact that a character is allowed to return back to the ancestral state. In this model characters can be gained and lost at most once. In this paper we consider the computational problem of explaining binary data by the Persistent Perfect Phylogeny model (referred as PPP) and for this purpose we investigate the problem of reconstructing an evolution where some constraints are imposed on the paths of the tree. RESULTS: We define a natural generalization of the PPP problem obtained by requiring that for some pairs (character, species), neither the species nor any of its ancestors can have the character. In other words, some characters cannot be persistent for some species. This new problem is called Constrained PPP (CPPP). Based on a graph formulation of the CPPP problem, we are able to provide a polynomial time solution for the CPPP problem for matrices whose conflict graph has no edges. Using this result, we develop a parameterized algorithm for solving the CPPP problem where the parameter is the number of characters. CONCLUSIONS: A preliminary experimental analysis shows that the constrained persistent perfect phylogeny model allows to explain efficiently data that do not conform with the classical perfect phylogeny model

    Approximately counting locally-optimal structures

    Full text link
    A locally-optimal structure is a combinatorial structure such as a maximal independent set that cannot be improved by certain (greedy) local moves, even though it may not be globally optimal. It is trivial to construct an independent set in a graph. It is easy to (greedily) construct a maximal independent set. However, it is NP-hard to construct a globally-optimal (maximum) independent set. In general, constructing a locally-optimal structure is somewhat more difficult than constructing an arbitrary structure, and constructing a globally-optimal structure is more difficult than constructing a locally-optimal structure. The same situation arises with listing. The differences between the problems become obscured when we move from listing to counting because nearly everything is #P-complete. However, we highlight an interesting phenomenon that arises in approximate counting, where the situation is apparently reversed. Specifically, we show that counting maximal independent sets is complete for #P with respect to approximation-preserving reductions, whereas counting all independent sets, or counting maximum independent sets is complete for an apparently smaller class, #RHΠ1\mathrm{\#RH}\Pi_1 which has a prominent role in the complexity of approximate counting. Motivated by the difficulty of approximately counting maximal independent sets in bipartite graphs, we also study the problem of approximately counting other locally-optimal structures that arise in algorithmic applications, particularly problems involving minimal separators and minimal edge separators. Minimal separators have applications via fixed-parameter-tractable algorithms for constructing triangulations and phylogenetic trees. Although exact (exponential-time) algorithms exist for listing these structures, we show that the counting problems are #P-complete with respect to both exact and approximation-preserving reductions.Comment: Accepted to JCSS, preliminary version accepted to ICALP 2015 (Track A

    Computer Science and Technology Series : XV Argentine Congress of Computer Science. Selected papers

    Get PDF
    CACIC'09 was the fifteenth Congress in the CACIC series. It was organized by the School of Engineering of the National University of Jujuy. The Congress included 9 Workshops with 130 accepted papers, 1 main Conference, 4 invited tutorials, different meetings related with Computer Science Education (Professors, PhD students, Curricula) and an International School with 5 courses. CACIC 2009 was organized following the traditional Congress format, with 9 Workshops covering a diversity of dimensions of Computer Science Research. Each topic was supervised by a committee of three chairs of different Universities. The call for papers attracted a total of 267 submissions. An average of 2.7 review reports were collected for each paper, for a grand total of 720 review reports that involved about 300 different reviewers. A total of 130 full papers were accepted and 20 of them were selected for this book.Red de Universidades con Carreras en Informática (RedUNCI
    corecore