1,786 research outputs found

MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space

Author: Aaron Darling
Altekar
Ayres
Bret Larget
Daniel L. Ayres
Drummond
Edwards
Fredrik Ronquist
Gelman
Gernhard
Goldman
Huelsenbeck
Huelsenbeck
Huelsenbeck
Höhna
Höhna
John P. Huelsenbeck
Lakner
Larget
Lartillot
Lepage
Liang Liu
Liu
Marc A. Suchard
Mau
Mau
Maxim Teslenko
Newton
Paul van der Mark
Posada
Posada
Rannala
Roberts
Ronquist
Ronquist
Ronquist
Sebastian Höhna
Stadler
Suchard
Thorne
Xie
Yang
Publication venue: Oxford University Press
Publication date: 01/05/2012

Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.

Crossref

OPUS - University of Technology Sydney

PubMed Central

eScholarship - University of California

BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

Author: Ayres Daniel L.
Beerli Peter
Cummings Michael P.
Darling Aaron
Holder Mark T.
Huelsenbeck John P.
Lewis Paul O.
Rambaut Andrew
Ronquist Fredrik
Swofford David L.
Zwickl Derrick J.
Publication venue: Oxford University Press (OUP)
Publication date: 15/04/2014

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. 
An example client program is available as public domain software. This work was supported by the National Science Foundation [grant numbers DBI-0755048, DEB-0732920, DEB-1036448, DMS-0931642, EF-0331495, EF-0905606, EF-0949453]; the National Institutes of Health [grant numbers R01-HG006139, R01-GM037841, R01-GM078985, R01-GM086887, R01-NS063897]; the Biotechnology and Biological Sciences Research Council [grant number BB/H011285/1]; the Wellcome Trust [grant number WT092807MA]; and Google Summer of Code.

KU ScholarWorks (Univ. of Kansas)

Ancient DNA Provides New Insights into the Evolutionary History of New Zealand's Extinct Giant Eagle

Author: Alan Cooper
Arredondo
Beth Shapiro
Bunce
Cooper
David Penny
Freckleton
Heather R. L Lerner
Holdaway
Huelsenbeck
Ian Barnes
Krajewski
Marta Szulkin
Michael Bunce
Palkovacs
Paxinos
Richard N Holdaway
Shapiro
Publication venue: Public Library of Science
Publication date: 01/01/2005

Prior to human settlement 700 years ago New Zealand had no terrestrial mammals—apart from three species of bats—instead, approximately 250 avian species dominated the ecosystem. At the top of the food chain was the extinct Haast's eagle, Harpagornis moorei. H. moorei (10–15 kg; 2–3 m wingspan) was 30%–40% heavier than the largest extant eagle (the harpy eagle, Harpia harpyja), and hunted moa up to 15 times its weight. In a dramatic example of morphological plasticity and rapid size increase, we show that the H. moorei was very closely related to one of the world's smallest extant eagles, which is one-tenth its mass. This spectacular evolutionary change illustrates the potential speed of size alteration within lineages of vertebrates, especially in island ecosystems

Royal Holloway Research Online

Crossref

Adelaide Research & Scholarship

Directory of Open Access Journals

Royal Holloway - Pure

PubMed Central

Research Repository

The Francis Crick Institute

An efficient and extensible approach for compressing phylogenetic trees

Author: AD Molin
DE Soltis
HE Williams
JP Huelsenbeck
LA Lewis
N Amenta
PA Goloboff
RS Boyer
SJ Matthews
SJ Sul
Suzanne J Matthews
Tiffani L Williams
WA Hunt Jr
Publication venue: BioMed Central
Publication date: 01/01/2011

Biologists require new algorithms to efficiently compress and store their large collections of phylogenetic trees. TreeZip is a novel method for compressing phylogenetic trees. Recently, we extended our TreeZip algorithm to support branch lengths and show how it can be used to extract sets of trees of interest quickly. The key advantage of TreeZip over standard compression methods like 7zip is its ability to interpret and compress tree collections semantically, making it immune to branch rotations and allowing key operations (such as calculating a consensus tree) to be performed quickly and without a loss of space savings. On unweighted phylogenetic trees, TreeZip is able to compress Newick files in excess of 98%. On weighted phylogenetic trees, TreeZip is able to compress a Newick file by at least 73%. TreeZip can be combined with 7zip with little overhead, allowing space savings in excess of 99% (unweighted) and 92% (weighted). Unlike TreeZip, 7zip is not immune to branch rotations, and performs worse as the level of variability in the Newick string representation increases. Finally, since the TreeZip compressed text (TRZ) file contains all the semantic information in a collection of trees, we can easily filter and decompress a subset of trees of interest (such as the set of unique trees), or build the resulting consensus tree in a matter of seconds. We also show the ease with which set operations can be performed on TRZ files, at speeds quicker than those performed on Newick or 7zip compressed Newick files, and without loss of space savings. TreeZip is an efficient approach for compressing large collections of phylogenetic trees. The semantic and compact nature of the TRZ file allows it to be operated upon directly and quickly, without a need to decompress the original Newick file. We believe that TreeZip will be vital for compressing and archiving trees in the biological community.
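
The abstract's point about immunity to branch rotations can be illustrated with a small sketch. This is not TreeZip's implementation or the TRZ format, neither of which is described here. It assumes rooted Newick strings without branch lengths, and the helper names `parse_newick` and `clades` are hypothetical. The idea shown is only that a tree stored as the set of taxon groups under its internal nodes has the same representation however its children are ordered in the Newick text.

```python
def parse_newick(text):
    """Parse a Newick string WITHOUT branch lengths into nested tuples.

    Leaves become strings, internal nodes become tuples of children.
    Illustrative only: no support for branch lengths or internal labels.
    """
    s = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while s[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return tuple(children)
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos].strip()

    return node()


def clades(tree):
    """Return the set of taxon sets found under each internal node.

    The result ignores the left-to-right order of children, so two
    Newick strings that differ only by branch rotations compare equal.
    """
    found = set()

    def leaves(n):
        if isinstance(n, str):
            return frozenset([n])
        taxa = frozenset().union(*(leaves(child) for child in n))
        found.add(taxa)
        return taxa

    leaves(tree)
    return found


original = clades(parse_newick("((A,B),(C,D));"))
rotated = clades(parse_newick("((D,C),(B,A));"))    # same tree, rotated
different = clades(parse_newick("((A,C),(B,D));"))  # different topology

print(original == rotated)    # rotations do not change the clade set
print(original == different)  # a real topology change does
```

A byte-oriented compressor sees the two rotated strings as different text, whereas a comparison on the clade sets treats them as one tree, which is the kind of redundancy a semantic representation can exploit.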

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

OAKTrust Digital Repository (Texas A&M Univ)

Selective Constraints on Amino Acids Estimated by a Mechanistic Codon Substitution Model with Multiple Nucleotide Changes

Author: A Doron-Faigenboim
A Schneider
AL Halpern
AR Kinjo
C Kosiol
Darren Martin
DT Jones
G Bazykin
GC Conant
H Akaike
I Keller
J Adachi
J Adachi
JP Huelsenbeck
K Tamura
L Jin
M Anisimova
M Averof
M Hasegawa
M Kimura
MA Larkin
MO Dayhoff
MW Dimmic
N Goldman
N Rodrigue
N Takahata
NGC Smith
R Grantham
S Guindon
S Miyazawa
S Whelan
S Whelan
S Whelan
Sanzo Miyazawa
SC Choi
SQ Le
SV Muse
T Miyata
T Miyata
TK Seo
TK Seo
W Delport
W Delport
Z Yang
Z Yang
Z Yang
Z Yang
Publication venue: Public Library of Science (PLoS)
Publication date: 18/03/2011

Empirical substitution matrices represent the average tendencies of substitutions over various protein families by sacrificing gene-level resolution. We develop a codon-based model, in which mutational tendencies of codon, a genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices. Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better. Also, the ML estimates of transition-transversion bias obtained from these empirical matrices are not so large as previously estimated. The selective constraints are characteristic of proteins rather than species. However, their relative strengths among amino acid pairs can be approximated not to depend very much on protein families but amino acid pairs, because the present model, in which selective constraints are approximated to be a linear function of those estimated from the JTT/WAG/LG/KHG matrices, can provide a good fit to other empirical substitution matrices including cpREV for chloroplast proteins and mtREV for vertebrate mitochondrial proteins. 
The present codon-based model with the ML estimates of selective constraints and with adjustable mutation rates of nucleotide would be useful as a simple substitution model in ML and Bayesian inferences of molecular phylogenetic trees, and enables us to obtain biologically meaningful information at both nucleotide and amino acid levels from codon and protein sequences. Comment: Table 9 in this article includes corrections for errata in the Table 9 published in 10.1371/journal.pone.0017244. Supporting information is attached at the end of the article, and a computer-readable dataset of the ML estimates of selective constraints is available from 10.1371/journal.pone.001724

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Data incongruence and the problem of avian louse phylogeny

Recent studies based on different types of data (i.e. morphological and molecular) have supported conflicting phylogenies for the genera of avian feather lice (Ischnocera: Phthiraptera). We analyse new and published data from morphology and from mitochondrial (12S rRNA and COI) and nuclear (EF1-α) genes to explore the sources of this incongruence and explain these conflicts. Character convergence, multiple substitutions at high divergences, and ancient radiation over a short period of time have contributed to the problem of resolving louse phylogeny with the data currently available. We show that apparent incongruence between the molecular datasets is largely attributable to rate variation and nonstationarity of base composition. In contrast, highly significant character incongruence leads to topological incongruence between the molecular and morphological data. We consider ways in which biases in the sequence data could be misleading, using several maximum likelihood models and LogDet corrections. The hierarchical structure of the data is explored using likelihood mapping and SplitsTree methods. Ultimately, we concede there is strong discordance between the molecular and morphological data and apply the conditional combination approach in this case. We conclude that higher level phylogenetic relationships within avian Ischnocera remain extremely problematic. However, consensus between datasets is beginning to converge on a stable phylogeny for avian lice, at and below the familial rank.

Crossref

Enlighten

BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

Author: Aaron Darling
Andrew Rambaut
Daniel L. Ayres
David L. Swofford
Derrick J. Zwickl
Drummond
Felsenstein
Fredrik Ronquist
John P. Huelsenbeck
Marc A. Suchard
Mark T. Holder
Michael P. Cummings
Paul O. Lewis
Peter Beerli
Regier
Ronquist
Suchard
Swofford
Zwickl
Publication venue: Oxford University Press
Publication date: 01/10/2011

CiteSeerX

Crossref

OPUS - University of Technology Sydney

KU ScholarWorks (Univ. of Kansas)

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

Ecdysozoan mitogenomics: evidence for a common origin of the legged invertebrates, the Panarthropoda

Author: Abascal
Adachi
Aguinaldo
Blanquart
Boore
Boore
Bourlat
Bourlat
Brinkmann
Brinkmann
Brinkmann
Cameron
Carapelli
Castoe
Curole
Davide Pisani
Delsuc
Dennis V. Lavrov
Dianne Gleeson
Dohle
Dunn
Edgar
Edgecombe
Ehsan Kayal
Felsenstein
Fendt
Foster
Friedrich
Gibson
Harzsch
Hassanin
Hassanin
Hejnol
Helfenbein
Huelsenbeck
Jeffrey L. Boore
Jennifer Daub
Jones
Lanave
Lartillot
Lartillot
Lartillot
Lartillot
Laslett
Lavrov
Lavrov
Lowe
Mallatt
Mark Blaxter
Masta
Maximilian J. Telford
Mayer
Mayer
Mooers
Mwinyi
Nardi
Nielsen
Omar Rota-Stabelli
Perna
Peterson
Pisani
Pisani
Pisani
Podsiadlowski
Regier
Regier
Roeding
Rota-Stabelli
Rota-Stabelli
Ryu
Saccone
Saccone
Scholz
Sperling
Stamatakis
Telford
Webster
Yang
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2010

Ecdysozoa is the recently recognized clade of molting animals that comprises the vast majority of extant animal species and the most important invertebrate model organisms: the fruit fly and the nematode worm. Evolutionary relationships within the ecdysozoans remain, however, unresolved, impairing the correct interpretation of comparative genomic studies. In particular, the affinities of the three Panarthropoda phyla (Arthropoda, Onychophora, and Tardigrada) and the position of Myriapoda within Arthropoda (Mandibulata vs. Myriochelata hypothesis) are among the most contentious issues in animal phylogenetics. To elucidate these relationships, we have determined and analyzed complete or nearly complete mitochondrial genome sequences of two Tardigrada, Hypsibius dujardini and Thulinia sp. (the first genomes to date for this phylum); one Priapulida, Halicryptus spinulosus; and two Onychophora, Peripatoides sp. and Epiperipatus biolleyi; and a partial mitochondrial genome sequence of the Onychophora Euperipatoides kanagrensis. Tardigrada mitochondrial genomes resemble those of the arthropods in terms of the gene order and strand asymmetry, whereas Onychophora genomes are characterized by numerous gene order rearrangements and strand asymmetry variations. In addition, Onychophora genomes are extremely enriched in A and T nucleotides, whereas Priapulida and Tardigrada are more balanced. Phylogenetic analyses based on concatenated amino acid coding sequences support a monophyletic origin of the Ecdysozoa and the position of Priapulida as the sister group of a monophyletic Panarthropoda (Tardigrada plus Onychophora plus Arthropoda). The position of Tardigrada is more problematic, most likely because of long branch attraction (LBA). However, experiments designed to reduce LBA suggest that the most likely placement of Tardigrada is as a sister group of Onychophora.
The same analyses also recover monophyly of traditionally recognized arthropod lineages such as Arachnida and of the highly debated clade Mandibulata.

Crossref

University of Canberra Research Repository

PubMed Central

UCL Discovery

Edinburgh Research Explorer

Explore Bristol Research

Taxon ordering in phylogenetic trees by means of evolutionary algorithms

Author: A Tettamanzi
AE Eiben
C Cotta
C Darwin
CT Davis
E Paradis
EO Wiley
F Cerutti
F Ronquist
Francesco Cerutti
J Barthélemy
JP Huelsenbeck
L Bertolotti
L Bertolotti
Luigi Bertolotti
M Nei
Mario Giacobini
P Moscato
R Lanciotti
RDM Page
Team RDC
Tony L Goldberg
W Maddison
W Maddison
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: In a typical "left-to-right" phylogenetic tree, the vertical order of taxa is meaningless, as only the branch path between them reflects their degree of similarity. To make unresolved trees more informative, here we propose an innovative Evolutionary Algorithm (EA) method to search for the best graphical representation of unresolved trees, in order to give a biological meaning to the vertical order of taxa. Methods: Starting from a West Nile virus phylogenetic tree, in a (1 + 1)-EA we evolved it by randomly rotating the internal nodes and selecting the tree with better fitness every generation. The fitness is a sum of genetic distances between the considered taxon and the r (radius) next taxa. After having set the radius to the best performance, we evolved the trees with (λ + μ)-EAs to study the influence of population on the algorithm. Results: The (1 + 1)-EA consistently outperformed a random search, and better results were obtained setting the radius to 8. The (λ + μ)-EAs performed as well as the (1 + 1), except for the larger population (1000 + 1000). Conclusions: The trees after the evolution showed an improvement both of the fitness (based on a genetic distance matrix, so taxa placed close together are actually genetically close) and of the biological interpretation. Samples collected in the same state or year moved closer to each other, making the tree easier to interpret. Biological relationships between samples are also easier to observe.
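
The (1 + 1)-EA described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: trees are nested tuples of taxon names, a "rotation" reverses the children of one randomly chosen internal node, lower fitness is treated as better (the abstract does not state the direction explicitly), ties are accepted, and the toy distance table and all function names are hypothetical.

```python
import random


def leaf_order(tree):
    """Return the taxa in the order they would be drawn, top to bottom."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaf_order(child)]


def fitness(order, dist, radius):
    """Sum of distances between each taxon and the `radius` taxa after it."""
    total = 0.0
    for i, taxon in enumerate(order):
        for neighbour in order[i + 1 : i + 1 + radius]:
            total += dist[frozenset((taxon, neighbour))]
    return total


def count_internal(tree):
    """Number of internal nodes in the tree."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_internal(child) for child in tree)


def rotate(tree, target):
    """Return a copy of the tree with the children of one internal node
    (the `target`-th in preorder) reversed; the topology is unchanged."""
    counter = -1

    def walk(node):
        nonlocal counter
        if isinstance(node, str):
            return node
        counter += 1
        mine = counter
        children = [walk(child) for child in node]
        if mine == target:
            children.reverse()
        return tuple(children)

    return walk(tree)


def one_plus_one_ea(tree, dist, radius, generations, rng):
    """Keep one parent; each generation, mutate it by one random rotation
    and keep the child if its fitness is no worse (assumed: lower is better)."""
    best = tree
    best_fit = fitness(leaf_order(best), dist, radius)
    n_internal = count_internal(tree)
    for _ in range(generations):
        child = rotate(best, rng.randrange(n_internal))
        child_fit = fitness(leaf_order(child), dist, radius)
        if child_fit <= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit


# Toy data: four taxa on a line, distance = gap between their positions.
positions = {"A": 0, "B": 1, "C": 2, "D": 3}
toy_dist = {
    frozenset((a, b)): abs(positions[a] - positions[b])
    for a in positions for b in positions if a != b
}
start_tree = (("B", "A"), ("C", "D"))
start_fit = fitness(leaf_order(start_tree), toy_dist, 1)
final_tree, final_fit = one_plus_one_ea(
    start_tree, toy_dist, radius=1, generations=200, rng=random.Random(0)
)
print(leaf_order(final_tree), final_fit)
```

Because rotations never change which taxa share a clade, the search only reorders the drawing; the phylogeny itself is untouched.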

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Institutional Research Information System University of Turin