52 research outputs found

Optimal Completion and Comparison of Incomplete Phylogenetic Trees Under Robinson-Foulds Distance

Author: Bansal Mukul S.
Yao Keegan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

PhyloFinder: An intelligent search engine for phylogenetic tree databases

Author: Bansal Mukul S
Burleigh J Gordon
Chen Duhong
Fernández-Baca David
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Bioinformatic tools are needed to store and access the rapidly growing phylogenetic data. These tools should enable users to identify existing phylogenetic trees containing a specified taxon or set of taxa and to compare a specified phylogenetic hypothesis to existing phylogenetic trees. Results: PhyloFinder is an intelligent search engine for phylogenetic databases that we have implemented using trees from TreeBASE. It enables taxonomic queries, in which it identifies trees in the database containing the exact name of the query taxon and/or any synonymous taxon names, and it provides spelling suggestions for the query when there is no match. Additionally, PhyloFinder can identify trees containing descendants or direct ancestors of the query taxon. PhyloFinder also performs phylogenetic queries, in which it identifies trees that contain the query tree or topologies that are similar to the query tree. Conclusion: PhyloFinder can enhance the utility of any tree database by providing tools for both taxonomic and phylogenetic queries as well as visualization tools that highlight the query results and provide links to NCBI and TBMap. An implementation of PhyloFinder using trees from TreeBASE is available from the web client application found in the availability and requirements section.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

iGTP: A software package for large-scale gene tree parsimony analysis

Author: Bansal Mukul S
Chaudhary Ruchi
Eulenstein Oliver
Fernández-Baca David
Wehe André
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The ever-increasing wealth of genomic sequence information provides an unprecedented opportunity for large-scale phylogenetic analysis. However, species phylogeny inference is obfuscated by incongruence among gene trees due to evolutionary events such as gene duplication and loss, incomplete lineage sorting (deep coalescence), and horizontal gene transfer. Gene tree parsimony (GTP) addresses this issue by seeking a species tree that requires the minimum number of evolutionary events to reconcile a given set of incongruent gene trees. Despite its promise, the use of gene tree parsimony has been limited by the fact that existing software is either not fast enough to tackle large data sets or is restricted in the range of evolutionary events it can handle. Results: We introduce iGTP, a platform-independent software program that implements state-of-the-art algorithms that greatly speed up species tree inference under the duplication, duplication-loss, and deep coalescence reconciliation costs. iGTP significantly extends and improves the functionality and performance of existing gene tree parsimony software and offers advanced features such as building effective initial trees using stepwise leaf addition and the ability to have unrooted gene trees in the input. Moreover, iGTP provides a user-friendly graphical interface with integrated tree visualization software to facilitate analysis of the results. Conclusions: iGTP enables, for the first time, gene tree parsimony analyses of thousands of genes from hundreds of taxa using the duplication, duplication-loss, and deep coalescence reconciliation costs, all from within a convenient graphical user interface.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Pareto-optimal phylogenetic tree reconciliation

Author: Bansal Mukul S.
Kellis Manolis
Libeskind-Hadas Ran
Wu Yi-Chieh
Publication venue: Oxford University Press (OUP)
Publication date: 01/06/2014

Motivation: Phylogenetic tree reconciliation is a widely used method for reconstructing the evolutionary histories of gene families and species, hosts and parasites and other dependent pairs of entities. Reconciliation is typically performed using maximum parsimony, in which each evolutionary event type is assigned a cost and the objective is to find a reconciliation of minimum total cost. It is generally understood that reconciliations are sensitive to event costs, but little is understood about the relationship between event costs and solutions. Moreover, choosing appropriate event costs is a notoriously difficult problem. Results: We address this problem by giving an efficient algorithm for computing Pareto-optimal sets of reconciliations, thus providing the first systematic method for understanding the relationship between event costs and reconciliations. This, in turn, results in new techniques for computing event support values and, for cophylogenetic analyses, performing robust statistical tests. We provide new software tools and demonstrate their use on a number of datasets from evolutionary genomic and cophylogenetic studies. Funding: National Science Foundation (U.S.) (CAREER award 0644282); University of Connecticut (Startup funds); Harvey Mudd College (R. Michael Shanahan Endowment
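
The notion of Pareto-optimality used in this abstract can be illustrated with a small sketch. It is not the efficient algorithm the authors describe: it is a brute-force filter over a hypothetical list of candidate reconciliations, each summarized (as an assumption made here) by a tuple of event counts (duplications, transfers, losses). The names `dominates`, `pareto_front` and `total_cost` are illustrative only.

```python
from typing import Iterable, List, Tuple

# Hypothetical summary of one reconciliation: (duplications, transfers, losses).
EventCounts = Tuple[int, int, int]


def dominates(a: EventCounts, b: EventCounts) -> bool:
    """True if a is no worse than b in every event count and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def pareto_front(candidates: Iterable[EventCounts]) -> List[EventCounts]:
    """Keep the candidates that no other candidate dominates (brute force)."""
    unique = sorted(set(candidates))
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


def total_cost(counts: EventCounts, costs: Tuple[float, float, float]) -> float:
    """Parsimony cost: each event count weighted by its assigned event cost."""
    return sum(n * c for n, c in zip(counts, costs))


if __name__ == "__main__":
    # Made-up event counts, for illustration only.
    candidates = [(2, 0, 5), (1, 1, 2), (0, 3, 1), (2, 1, 3), (1, 1, 4)]
    front = pareto_front(candidates)
    print(front)
    # For any positive event costs, a minimum-cost candidate lies on the front.
    print(min(front, key=lambda c: total_cost(c, (2.0, 3.0, 1.0))))
```

Under this simplification, changing the event costs only changes which member of the front is cheapest, which is one way to read the abstract's point about the sensitivity of reconciliations to event costs.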

DSpace@MIT

Crossref

PubMed Central

Robinson-Foulds Supertrees

Author: Bansal Mukul S
Burleigh J Gordon
Eulenstein Oliver
Fernández-Baca David
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Supertree methods synthesize collections of small phylogenetic trees with incomplete taxon overlap into comprehensive trees, or supertrees, that include all taxa found in the input trees. Supertree methods based on the well established Robinson-Foulds (RF) distance have the potential to build supertrees that retain much information from the input trees. Specifically, the RF supertree problem seeks a binary supertree that minimizes the sum of the RF distances from the supertree to the input trees. Thus, an RF supertree is a supertree that is consistent with the largest number of clusters (or clades) from the input trees. Results: We introduce efficient, local search based, hill-climbing heuristics for the intrinsically hard RF supertree problem on rooted trees. These heuristics use novel non-trivial algorithms for the SPR and TBR local search problems which improve on the time complexity of the best known (naïve) solutions by a factor of Θ(n) and Θ(n²) respectively (where n is the number of taxa, or leaves, in the supertree). We use an implementation of our new algorithms to examine the performance of the RF supertree method and compare it to matrix representation with parsimony (MRP) and the triplet supertree method using four supertree data sets. Not only did our RF heuristic provide fast estimates of RF supertrees in all data sets, but the RF supertrees also retained more of the information from the input trees (based on the RF distance) than the other supertree methods. Conclusions: Our heuristics for the RF supertree problem, based on our new local search algorithms, make it possible for the first time to estimate large supertrees by directly optimizing the RF distance from rooted input trees to the supertrees. This provides a new and fast method to build accurate supertrees. RF supertrees may also be useful for estimating majority-rule(-) supertrees, which are a generalization of majority-rule consensus trees.
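
The objective named in this abstract, the sum of RF distances from a candidate supertree to the input trees, can be sketched as follows. This is a minimal illustration and not the authors' implementation or their SPR/TBR search. It assumes rooted trees written as nested tuples of leaf names, takes the RF distance to be the size of the symmetric difference of the two trees' nontrivial cluster sets, and assumes the supertree is restricted to each input tree's leaf set before comparison. All function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Set, Tuple, Union

# A rooted tree is a leaf name or a tuple of subtrees (assumed representation).
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Nontrivial clusters: leaf sets of internal nodes other than the root."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    found.discard(walk(tree))  # the root cluster is shared by all trees
    return found


def restrict(tree: Tree, keep: FrozenSet[str]) -> Optional[Tree]:
    """Prune leaves outside `keep` and suppress nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [k for k in (restrict(child, keep) for child in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def rf_distance(a: Tree, b: Tree) -> int:
    """RF distance between two rooted trees on the same leaf set."""
    return len(clusters(a) ^ clusters(b))


def rf_supertree_score(supertree: Tree, inputs: Iterable[Tree]) -> int:
    """Sum of RF distances from the restricted supertree to each input tree.

    Assumes every input tree's leaves also occur in the supertree.
    """
    return sum(rf_distance(restrict(supertree, leaves(t)), t) for t in inputs)


if __name__ == "__main__":
    supertree = ((("a", "b"), "c"), ("d", "e"))
    inputs = [(("a", "b"), "c"), (("a", "d"), "e")]
    print(rf_supertree_score(supertree, inputs))
```

A local search heuristic of the kind the abstract describes would repeatedly evaluate a score like this on neighbouring supertrees; the sketch recomputes it from scratch, which is the naïve cost the authors report improving on.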

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Genome-Scale Phylogenetics: Inferring the Plant Tree of Life from 18,896 Gene Trees

Author: Bansal Mukul S.
Burleigh J. Gordon
Eulenstein Oliver
Hartmann Stefanie
Vision Todd J.
Wehe André
Publication venue
Publication date: 24/12/2010

Phylogenetic analyses using genome-scale data sets must confront incongruence among gene trees, which in plants is exacerbated by frequent gene duplications and losses. Gene tree parsimony (GTP) is a phylogenetic optimization criterion in which a species tree that minimizes the number of gene duplications induced among a set of gene trees is selected. The run time performance of previous implementations has limited its use on large-scale data sets. We used new software that incorporates recent algorithmic advances to examine the performance of GTP on a plant data set consisting of 18,896 gene trees containing 510,922 protein sequences from 136 plant taxa (giving a combined alignment length of >2.9 million characters). The relationships inferred from the GTP analysis were largely consistent with previous large-scale studies of backbone plant phylogeny and resolved some controversial nodes. The placement of taxa that were present in few gene trees generally varied the most among GTP bootstrap replicates. Excluding these taxa either before or after the GTP analysis revealed high levels of phylogenetic support across plants. The analyses supported magnoliids sister to a eudicot + monocot clade and did not support the eurosid I and II clades. This study presents a nuclear genomic perspective on the broad-scale phylogenic relationships among plants, and it demonstrates that nuclear genes with a history of duplication and loss can be phylogenetically informative for resolving the plant tree of life.

PubMed Central

Carolina Digital Repository

Improved gene tree error correction in the presence of horizontal gene transfer

Author: Eric J. Alm
Manolis Kellis
Mukul S. Bansal
Yi-Chieh Wu
Publication venue: Oxford University Press (OUP)
Publication date: 01/11/2014

Motivation: The accurate inference of gene trees is a necessary step in many evolutionary studies. Although the problem of accurate gene tree inference has received considerable attention, most existing methods are only applicable to gene families unaffected by horizontal gene transfer. As a result, the accurate inference of gene trees affected by horizontal gene transfer remains a largely unaddressed problem. Results: In this study, we introduce a new and highly effective method for gene tree error correction in the presence of horizontal gene transfer. Our method efficiently models horizontal gene transfers, gene duplications and losses, and uses a statistical hypothesis testing framework [Shimodaira–Hasegawa (SH) test] to balance sequence likelihood with topological information from a known species tree. Using a thorough simulation study, we show that existing phylogenetic methods yield inaccurate gene trees when applied to horizontally transferred gene families and that our method dramatically improves gene tree accuracy. We apply our method to a dataset of 11 cyanobacterial species and demonstrate the large impact of gene tree accuracy on downstream evolutionary analyses. Availability and implementation: An implementation of our method is available at http://compbio.mit.edu/treefix-dtl/ Funding: National Science Foundation (U.S.) (CAREER Award 0644282); National Institutes of Health (U.S.) (RC2 HG005639); National Science Foundation (U.S.). Assembling the Tree of Life (Program) (0936234); University of Connecticu

Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss

Author: Eric J. Alm
Manolis Kellis
Mukul S. Bansal
Publication venue: Oxford University Press
Publication date: 01/01/2012

Motivation: Gene family evolution is driven by evolutionary events such as speciation, gene duplication, horizontal gene transfer and gene loss, and inferring these events in the evolutionary history of a given gene family is a fundamental problem in comparative and evolutionary genomics with numerous important applications. Solving this problem requires the use of a reconciliation framework, where the input consists of a gene family phylogeny and the corresponding species phylogeny, and the goal is to reconcile the two by postulating speciation, gene duplication, horizontal gene transfer and gene loss events. This reconciliation problem is referred to as duplication-transfer-loss (DTL) reconciliation and has been extensively studied in the literature. Yet, even the fastest existing algorithms for DTL reconciliation are too slow for reconciling large gene families and for use in more sophisticated applications such as gene tree or species tree reconstruction.