14 research outputs found

An experimental study of Quartets MaxCut and other supertree methods

Author: A Ben-dor
A Dress
A Stamatakis
B Holland
B Rannala
BR Baum
C Randal Linder
CJ Creevey
D Chen
D Chen
D Thain
DL Swofford
H Bolaender
JG Burleigh
K Strimmer
KC Nixon
KS John
LR Foulds
M Bansal
M Shel Swenson
MA Ragan
MS Swenson
ORP Bininda-Emonds
ORP Bininda-Emonds
Rahul Suri
S Snir
S Snir
T Jiang
T Jiang
Tandy Warnow
V Ranwez
V Ranwez
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Supertree methods represent one of the major ways by which the Tree of Life can be estimated, but despite many recent algorithmic innovations, matrix representation with parsimony (MRP) remains the main algorithmic supertree method. Results: We evaluated the performance of several supertree methods based upon the Quartets MaxCut (QMC) method of Snir and Rao and showed that two of these methods usually outperform MRP and five other supertree methods that we studied, under many realistic model conditions. However, the QMC-based methods have scalability issues that may limit their utility on large datasets. We also observed that taxon sampling impacted supertree accuracy, with poor results obtained when all of the source trees were only sparsely sampled. Finally, we showed that the popular optimality criterion of minimizing the total topological distance of the supertree to the source trees is only weakly correlated with supertree topological accuracy. Therefore, evaluating supertree methods on biological datasets is problematic. Conclusions: Our results show that supertree methods that improve upon MRP are possible, and that an effort should be made to produce scalable and robust implementations of the most accurate supertree methods. Also, because topological accuracy depends upon taxon sampling strategies, attempts to construct very large phylogenetic trees using supertree methods should consider the selection of source tree datasets, as well as supertree methods. Finally, since supertree topological error is only weakly correlated with the supertree's topological distance to its source trees, development and testing of supertree methods presents methodological challenges.
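
The "total topological distance of the supertree to the source trees" criterion can be illustrated with a minimal sketch. It assumes the Robinson-Foulds distance as the topological distance (the abstract does not name one) and represents trees as nested tuples of leaf names. All function names are hypothetical and are not taken from the paper.

```python
def leaf_set(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree, taxa):
    """Nontrivial bipartitions of `tree` after restricting it to `taxa`."""
    taxa = frozenset(taxa)
    splits = set()

    def visit(node):
        # Collect the retained leaves below this node.
        if isinstance(node, str):
            below = frozenset([node]) & taxa
        else:
            below = frozenset().union(*(visit(child) for child in node))
        other = taxa - below
        # Keep only splits with at least two taxa on each side.
        if len(below) >= 2 and len(other) >= 2:
            splits.add(frozenset([below, other]))
        return below

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b, taxa):
    """Assumed distance: number of bipartitions found in exactly one tree."""
    return len(bipartitions(tree_a, taxa) ^ bipartitions(tree_b, taxa))


def total_distance(supertree, source_trees):
    """Sum of distances from the supertree, restricted to each source
    tree's taxa, to that source tree."""
    return sum(rf_distance(supertree, s, leaf_set(s)) for s in source_trees)


supertree = ((("a", "b"), "c"), (("d", "e"), "f"))
sources = [(("a", "b"), ("c", "d")), (("a", "d"), ("e", "f"))]
print(total_distance(supertree, sources))
```

The first source tree agrees with the restricted supertree, while the second conflicts with it on one bipartition, which contributes two to the sum. As the abstract notes, a low value of this score is only weakly correlated with topological accuracy.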

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Texas ScholarWorks

Split-based computation of majority-rule supertrees

Author: A Kupczok
A Kupczok
AG Rodrigo
Anne Kupczok
B Holland
BR Baum
C Semple
C Semple
CA Meacham
CA Phillips
CJ Creevey
CJ Creevey
D Bryant
D Fitzpatrick
D Pisani
D Wu
DF Robinson
DH Huson
DL Swofford
E Bapteste
GU Yule
HA Ross
HT Lin
J Dong
J Dong
J Dong
JA Cotton
JL Thorley
M Kennedy
M Steel
M Wilkinson
M Wilkinson
M Wilkinson
MA Ragan
MJ Sanderson
MJ Sanderson
MS Bansal
MS Waterman
N Galtier
ORP Bininda-Emonds
P Puigbò
PA Goloboff
R Beck
RB Davis
RDM Page
T Margush
WF Doolittle
WJ Baker
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Supertree methods combine overlapping input trees into a larger supertree. Here, I consider split-based supertree methods that first extract the split information of the input trees and subsequently combine this split information into a phylogeny. Well-known split-based supertree methods are matrix representation with parsimony and matrix representation with compatibility. Combining input trees on the same taxon set, as in the consensus setting, is a well-studied task and it is thus desirable to generalize consensus methods to supertree methods. Results: Here, three variants of majority-rule (MR) supertrees that generalize majority-rule consensus trees are investigated. I provide simple formulas for computing the respective score for bifurcating input- and supertrees. These score computations, together with a heuristic tree search minimizing the scores, were implemented in the Python program PluMiST (Plus- and Minus SuperTrees) available from http://www.cibiv.at/software/plumist. The different MR methods were tested by simulation and on real data sets. The search heuristic was successful in combining compatible input trees. When combining incompatible input trees, one variant in particular, MR(-) supertrees, performed well. Conclusions: The presented framework allows for an efficient score computation of three majority-rule supertree variants and input trees. I combined the score computation with a heuristic search over the supertree space. The implementation was tested by simulation and on real data sets and showed promising results. The MR(-) variant in particular seems to be a reasonable score for supertree reconstruction. Generalizing these computations to multifurcating trees is an open problem, which may be tackled using this framework.
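
For orientation, the consensus special case that MR supertrees generalize can be sketched as follows. This is not the PluMiST scoring or search; it only shows majority-rule consensus for input trees on the same taxon set, with each tree given as a set of splits. The helper names `split` and `majority_rule_splits` are hypothetical.

```python
from collections import Counter


def split(side, taxa):
    """Build an unordered split {side | rest} over the taxon set."""
    side = frozenset(side)
    return frozenset([side, frozenset(taxa) - side])


def majority_rule_splits(trees):
    """Return the splits that occur in more than half of the input trees.

    Each tree is a set of splits, so a split is counted once per tree.
    """
    counts = Counter(s for tree in trees for s in tree)
    return {s for s, c in counts.items() if c > len(trees) / 2}


taxa = "abcde"
trees = [
    {split("ab", taxa), split("abc", taxa)},
    {split("ab", taxa), split("abd", taxa)},
    {split("ac", taxa), split("abc", taxa)},
]
consensus = majority_rule_splits(trees)
print(len(consensus))
```

Here the splits ab|cde and abc|de each appear in two of the three trees and are retained; the other two splits appear once and are dropped. When input trees have different taxon sets, this simple counting no longer applies directly, which is the setting the MR supertree variants address.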

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

IST Austria: PubRep (Institute of Science and Technology)

DendroBlast: approximate phylogenetic trees in the absence of multiple sequence alignments

Author: A Kelil
A Novak
A Stamatakis
AJ Enright
B Dwivedi
B Holland
BL Cantarel
C Dessimoz
D Robinson
EV Kriventseva
G Jordan
G Reinert
G Talavera
G Yona
J Sukumaran
JA Lake
JD Thompson
JP Huelsenbeck
K Howe
K Katoh
K Liu
MA Suchard
Marc Robinson-Rechavi
MN Price
MN Price
Philip K. Maini
R Desper
R Hagopian
R Jothi
RC Edgar
RC Edgar
RL Tatusov
S Guindon
S Hartmann
S Kelly
S Vinga
SF Altschul
Steven Kelly
T Golubchik
TH Ogdenw
X Liu
Y Loewenstein
Publication venue: PLOS One
Publication date: 01/01/2013

The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data, we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly used distance-based methods, though not as accurate as maximum likelihood methods from good-quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to a previously published analysis of the same dataset using conventional methods. Taken together, these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/
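
The precision and recall comparison can be illustrated with a minimal sketch. It assumes that both measures are computed over tree bipartitions, with each bipartition stored as the frozenset of taxa on one canonical side. The abstract does not spell out the exact definition, so the function name and the example data are hypothetical.

```python
def precision_recall(inferred, reference):
    """Precision and recall of inferred bipartitions against a reference.

    precision = shared / number of inferred bipartitions
    recall    = shared / number of reference bipartitions
    """
    shared = len(inferred & reference)
    precision = shared / len(inferred) if inferred else 0.0
    recall = shared / len(reference) if reference else 0.0
    return precision, recall


# Illustrative data only: two of the three inferred bipartitions
# also occur among the four reference bipartitions.
inferred = {frozenset("ab"), frozenset("abc"), frozenset("de")}
reference = {frozenset("ab"), frozenset("abc"), frozenset("ef"), frozenset("abcd")}
p, r = precision_recall(inferred, reference)
print(p, r)
```

Under this assumption, precision penalizes bipartitions that the inferred tree asserts but the reference lacks, and recall penalizes reference bipartitions that the inferred tree fails to recover.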

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

FigShare

From trees to networks and back

Author: Bastkowski Sarah
Publication date: 01/12/2013

The evolutionary history of a set of species is commonly represented by a phylogenetic tree. Often, however, the data contain conflicting signals, which can be better represented by a more general structure, namely a phylogenetic network. Such networks allow the display of several alternative evolutionary scenarios simultaneously but this can come at the price of complex visual representations. Using so-called circular split networks reduces this complexity, because this type of network can always be visualized in the plane without any crossing edges. These circular split networks form the core of this thesis. We construct them, use them as a search space for minimum evolution trees and explore their properties. More specifically, we present a new method, called SuperQ, to construct a circular split network summarising a collection of phylogenetic trees that have overlapping leaf sets. Then, we explore the set of phylogenetic trees associated with a fixed circular split network, in particular using it as a search space for optimal trees. This set represents just a tiny fraction of the space of all phylogenetic trees, but we still find trees within it that compare quite favourably with those obtained by a leading heuristic, which uses tree edit operations for searching the whole tree space. In the last part, we advance our understanding of the set of phylogenetic trees associated with a circular split network. Specifically, we investigate the size of the so-called circular tree neighbourhood for the three tree edit operations, tree bisection and reconnection (tbr), subtree prune and regraft (spr) and nearest neighbour interchange (nni).

University of East Anglia digital repository

Reconstructing phylogenetic level-1 networks from nondense binet and trinet sets

Author: AV Aho
B Holland
C Choy
C Semple
Celine Scornavacca
D Gusfield
D Huson
DH Huson
E Bapteste
F Pardi
G Cardona
H Poormohammadi
J Jansson
J Jansson
J Jansson
K Strimmer
Katharina T. Huber
KT Huber
KT Huber
KT Huber
Leo van Iersel
LJJ Iersel van
P Gambette
Taoyang Wu
Vincent Moulton
Y Yu
Publication venue: Springer Science and Business Media LLC
Publication date: 25/11/2014

Binets and trinets are phylogenetic networks with two and three leaves, respectively. Here we consider the problem of deciding if there exists a binary level-1 phylogenetic network displaying a given set T of binary binets or trinets over a taxon set X, and constructing such a network whenever it exists. We show that this is NP-hard for trinets but polynomial-time solvable for binets. Moreover, we show that the problem is still polynomial-time solvable for inputs consisting of binets and trinets as long as the cycles in the trinets have size three. Finally, we present an O(3^{|X|} poly(|X|)) time algorithm for general sets of binets and trinets. The latter two algorithms generalise to instances containing level-1 networks with arbitrarily many leaves, and thus provide some of the first supernetwork algorithms for computing networks from a set of rooted phylogenetic networks.

arXiv.org e-Print Archive

CiteSeerX

Crossref

TU Delft Repository

Springer - Publisher Connector

INRIA a CCSD electronic archive server

HAL Descartes

HAL-IRD

University of East Anglia digital repository

HAL-CIRAD

A Comparison of Phylogenetic Network Methods Using Computer Simulation

Author: A Rzhetsky
A Shioura
AR Templeton
AR Templeton
AR Templeton
B Holland
B Rannala
BA Schaal
BME Moret
D Posada
D Posada
David Posada
DF Robinson
DH Huson
DH Huson
DL Swofford
DM Hillis
DM Hillis
FT Bakker
G Cardona
G Jin
HJ Bandelt
I Cassens
I Cassens
Jason E. Stajich
JS Song
KA Crandall
Keith A. Crandall
L Excoffier
LL Cavalli-Sforza
M Clement
M Forster
M Pagel
M Perez-Losada
MH Schierup
MK Kuhner
N Nguyen
N Saitou
RC Griffiths
RR Hudson
RR Hudson
S Schneider
S Wain-Hobson
Steven M. Woolley
TH Jukes
W-H Li
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: We present a series of simulation studies that explore the relative performance of several phylogenetic network approaches (statistical parsimony, split decomposition, union of maximum parsimony trees, neighbor-net, simulated history recombination upper bound, median-joining, reduced median joining and minimum spanning network) compared to standard tree approaches (neighbor-joining and maximum parsimony) in the presence and absence of recombination. Principal Findings: In the absence of recombination, all methods recovered the correct topology and branch lengths nearly all of the time when the substitution rate was low, except for minimum spanning networks, which did considerably worse. At a higher substitution rate, maximum parsimony and union of maximum parsimony trees were the most accurate. With recombination, the ability to infer the correct topology was halved for all methods and no method could accurately estimate branch lengths. Conclusions: Our results highlight the need for more accurate phylogenetic network methods and the importance of detecting and accounting for recombination in phylogenetic studies. Furthermore, we provide useful information for choosing a network algorithm and a framework in which to evaluate improvements to existing methods and nove

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker