Search CORE

775 research outputs found

Maximum agreement and compatible supertrees

Author: Berry Vincent
Nicolas François
Publication venue: Elsevier B.V.
Publication date: 01/01/2007
Field of study

AbstractGiven a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT.A sufficient condition is given that identifies cases where these problems can be solved by resorting to MAST and MCT as subproblems. This condition is met, for instance, when only two input trees are considered. Then we give algorithms for SMAST and SMCT that benefit from the link with the subtree problems. These algorithms run in time linear to the time needed to solve MAST, respectively MCT, on an instance of the same or smaller size.It is shown that arbitrary instances of SMAST and SMCT can be turned in polynomial time into instances composed of trees with a bounded number of leaves.SMAST is shown to be W[2]-hard when the considered parameter is the number of input leaves that have to be removed to obtain the agreement of the input trees. A similar result holds for SMCT. Moreover, the corresponding optimization problems, that is the complements of SMAST and SMCT, cannot be approximated in polynomial time within any constant factor, unless P=NP. These results also hold when the input trees have a bounded number of leaves.The presented results apply to both collections of rooted and unrooted trees

Elsevier - Publisher Connector

HAL Descartes

Hal-Diderot

Fixed Parameter Polynomial Time Algorithms for Maximum Agreement and Compatible Supertrees

Author: Hoang Viet Tung
Sung Wing-Kin
Publication venue
Publication date: 01/01/2008
Field of study

Consider a set of labels

L

and a set of trees {\mathcal T} = \{{\mathcal T}^{(1), {\mathcal T}^{(2), ..., {\mathcal T}^{(k) \$ where each tree {\mathcal T}^{(i)

is distinctly leaf-labeled by some subset of

. One fundamental problem is to find the biggest tree (denoted as supertree) to represent

\mathcal T}

which minimizes the disagreements with the trees in

{\mathcal T}

under certain criteria. This problem finds applications in phylogenetics, database, and data mining. In this paper, we focus on two particular supertree problems, namely, the maximum agreement supertree problem (MASP) and the maximum compatible supertree problem (MCSP). These two problems are known to be NP-hard for

k \geq 3

. This paper gives the first polynomial time algorithms for both MASP and MCSP when both

and the maximum degree

D$ of the trees are constant

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

ScholarBank@NUS

Phylogenetic Trees and Their Analysis

Author: Ford Eric
Publication venue: CUNY Academic Works
Publication date: 01/02/2014
Field of study

Determining the best possible evolutionary history, the lowest-cost phylogenetic tree, to fit a given set of taxa and character sequences using maximum parsimony is an active area of research due to its underlying importance in understanding biological processes. As several steps in this process are NP-Hard when using popular, biologically-motivated optimality criteria, significant amounts of resources are dedicated to both both heuristics and to making exact methods more computationally tractable. We examine both phylogenetic data and the structure of the search space in order to suggest methods to reduce the number of possible trees that must be examined to find an exact solution for any given set of taxa and associated character data. Our work on four related problems combines theoretical insight with empirical study to improve searching of the tree space. First, we show that there is a Hamiltonian path through tree space for the most common tree metrics, answering Bryant\u27s Challenge for the minimal such path. We next examine the topology of the search space under various metrics, showing that some metrics have local maxima and minima even with perfect data, while some others do not. We further characterize conditions for which sequences simulated under the Jukes-Cantor model of evolution yield well-behaved search spaces. Next, we reduce the search space needed for an exact solution by splitting the set of characters into mutually-incompatible subsets of compatible characters, building trees based on the perfect phylogenies implied by these sets, and then searching in the neighborhoods of these trees. We validate this work empirically. Finally, we compare two approaches to the generalized tree alignment problem, or GTAP: Sequence alignment followed by tree search vs. Direct Optimization, on both biological and simulated data

City University of New York

Supertree construction by matrix representation with flip

Author: Chen Duhong
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2003
Field of study

Digital Repository @ Iowa State University (ISU)

EvoMiner: Frequent Subtree Mining in Phylogenetic Databases

Author: Deepak Akshay
Fernández-Baca David
McMahon Michelle
Sanderson Michael
Tirthapura Srikanta
Tirthapura Srikanta
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2011
Field of study

The problem of mining collections of trees to identify common patterns, called frequent subtrees (FSTs), arises often when trying to make sense of the results of phylogenetic analysis. FST mining generalizes the well-known maximum agreement subtree problem. Here we present EvoMiner, a new algorithm for mining frequent subtrees in collections of phylogenetic trees. EvoMiner is an Apriori-like level-wise method, which uses novel phylogeny-specific constant-time candidate generation scheme, an efficient fingerprinting-based technique for downward closure operation, and a lowest common ancestor based support counting step that requires neither costly subtree operations nor database traversal. As a result of these techniques, our algorithm achieves speed-ups of up to 100 times or more over phylominer, another algorithm for mining phylogenetic trees. EvoMiner can also work in vertical mining mode, to use less memory at the expense of speed

Digital Repository @ Iowa State University (ISU)

Contributions to computational phylogenetics and algorithmic self-assembly

Author: Shutters Brad
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2013
Field of study

This dissertation addresses some of the algorithmic and combinatorial problems at the interface between biology and computation. In particular, it focuses on problems in both computational phylogenetics, an area of study in which computation is used to better understand evolutionary relationships, and algorithmic self-assembly, an area of study in which biological processes are used to perform computation. The first set of results investigate inferring phylogenetic trees from multi-state character data. We give a novel characterization of when a set of three-state characters has a perfect phylogeny and make progress on a long-standing conjecture regarding the compatibility of multi-state characters. The next set of results investigate inferring phylogenetic supertrees from collections of smaller input trees when the input trees do not fully agree on the relative positions of the taxa. Two approaches to dealing with such conflicting input trees are considered. The first is to contract a set of edges in the input trees so that the resulting trees have an agreement supertree. The second is to remove a set of taxa from the input trees so that the resulting trees have an agreement supertree. We give fixed-parameter tractable algorithms for both approaches. We then turn to the algorithmic self-assembly of fractal structures from DNA tiles and investigate approximating the Sierpinski triangle and the Sierpinski carpet with strict self-assembly. We prove tight bounds on approximating the Sierpinski triangle and exhibit a class of fractals that are generalizations of the Sierpinski carpet that can approximately self-assemble. We conclude by discussing some ideas for further research

Digital Repository @ Iowa State University (ISU)