325 research outputs found

Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

Author: A Stamatakis
AR Lemmon
AWF Edwards
B Kolaczkowski
BP Carlin
Bryan Kolaczkowski
D Hillis
D Penny
DJ Taylor
DL Swofford
DM Hillis
E Mossel
E Susko
F Delsuc
F Ronquist
FE Anderson
H Akaike
H Brinkmann
J Bergsten
J Felsenstein
Joseph W. Thornton
JP Huelsenbeck
JS Rogers
JT Chang
K Misawa
KG Karol
M Alfaro
M Anisimova
M Holder
M Pagel
M Spencer
ME Alfaro
MK Kuhner
MP Cummings
MP Simmons
N Saitou
P Erixon
P Lewis
P Lopez
PO Lewis
RC Jeffrey
S Guindon
Wayne Delport
WJ Bruno
WJ Murphy
Y Inagaki
Y Suzuki
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2009

Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias—which is apparent under both controlled simulation conditions and in analyses of empirical sequence data—also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages—that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Morphological and molecular differentiation of genus Corbicula suggests that two species are sympatrically distributed in Datong Lake in the Central Yangtze River Basin

Author: A Komaru
A Komura
AD Qiu
AY Karatayev
BL Choe
BS Morton
BY Huang
CL Counts
DL Li
DM Hillis
E Renard
J Felsenstein
J Rozas
JC Britton
JK Park
K Tamura
LM Pigneur
M Pfenninger
NB Liao
O Folmer
OK Kwon
R Araujo
R Sousa
S Siripattrawan
S Tchange
SCM Tsoi
SF Altschul
SJ Houki
SM Hedtke
T Lee
V Kijviriya
YY Liu
Publication venue: Springer Science and Business Media LLC

Crossref

A Comparison of Phylogenetic Network Methods Using Computer Simulation

Author: A Rzhetsky
A Shioura
AR Templeton
B Holland
B Rannala
BA Schaal
BME Moret
D Posada
David Posada
DF Robinson
DH Huson
DL Swofford
DM Hillis
FT Bakker
G Cardona
G Jin
HJ Bandelt
I Cassens
Jason E. Stajich
JS Song
KA Crandall
Keith A. Crandall
L Excoffier
LL Cavalli-Sforza
M Clement
M Forster
M Pagel
M Perez-Losada
MH Schierup
MK Kuhner
N Nguyen
N Saitou
RC Griffiths
RR Hudson
S Schneider
S Wain-Hobson
Steven M. Woolley
TH Jukes
W-H Li
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: We present a series of simulation studies that explore the relative performance of several phylogenetic network approaches (statistical parsimony, split decomposition, union of maximum parsimony trees, neighbor-net, simulated history recombination upper bound, median-joining, reduced median joining and minimum spanning network) compared to standard tree approaches (neighbor-joining and maximum parsimony) in the presence and absence of recombination. Principal Findings: In the absence of recombination, all methods recovered the correct topology and branch lengths nearly all of the time when the substitution rate was low, except for minimum spanning networks, which did considerably worse. At a higher substitution rate, maximum parsimony and union of maximum parsimony trees were the most accurate. With recombination, the ability to infer the correct topology was halved for all methods and no method could accurately estimate branch lengths. Conclusions: Our results highlight the need for more accurate phylogenetic network methods and the importance of detecting and accounting for recombination in phylogenetic studies. Furthermore, we provide useful information for choosing a network algorithm and a framework in which to evaluate improvements to existing methods and nove

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Taxonomic Reliability of DNA Sequences in Public Sequence Databases: A Fungal Perspective

Author: A Izzo
A Schüßler
C Moritz
Cecile Fairhead
CP Meyer
D Steinke
DA Benson
DL Hawksworth
DM Hillis
DS Hibbett
Erik Kristiansson
FM Cohan
I Álvarez
JP Clapp
Karl-Henrik Larsson
Kessy Abarenkov
M Blaxter
M Hajibabaei
Martin Ryberg
MC Ebach
PD Bridge
PDN Hebert
R. Henrik Nilsson
RH Nilsson
SF Altschul
TD Bruns
TJ White
TR Horton
U Kõljalg
Urmas Kõljalg
V Savolainen
Publication venue: Public Library of Science
Publication date: 01/01/2006

BACKGROUND: DNA sequences are increasingly seen as one of the primary information sources for species identification in many organism groups. Such approaches, popularly known as barcoding, are underpinned by the assumption that the reference databases used for comparison are sufficiently complete and feature correctly and informatively annotated entries. METHODOLOGY/PRINCIPAL FINDINGS: The present study uses a large set of fungal DNA sequences from the inclusive International Nucleotide Sequence Database to show that the taxon sampling of fungi is far from complete, that about 20% of the entries may be incorrectly identified to species level, and that the majority of entries lack descriptive and up-to-date annotations. CONCLUSIONS: The problems with taxonomic reliability and insufficient annotations in public DNA repositories form a tangible obstacle to sequence-based species identification, and it is manifest that the greatest challenges to biological barcoding will be of taxonomical, rather than technical, nature

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Chalmers Research

Chalmers Publication Library

Resurrection of an ancestral 5S rRNA

Author: BS Schuwirth
C Hsiao
DE Dykhuizen
DL Hartl
DM Hillis
DM Wallace
EA Gaucher
GE Fox
George E Fox
GN Godson
J Brosius
J Felsenstein
JM Thomson
K Bokov
KOF Hedenstierna
L Chao
M Gullberg
M Nayar
M Szymański
MJ Belousoff
MT MacDonell
PD Williams
Qing Lu
RE Lenski
RF Service
SA Benner
TM Jermann
YH Lee
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: In addition to providing phylogenetic relationships, tree making procedures such as parsimony and maximum likelihood can make specific predictions of actual historical sequences. Resurrection of such sequences can be used to understand early events in evolution. In the case of RNA, the nature of parsimony is such that when applied to multiple RNA sequences it typically predicts ancestral sequences that satisfy the base pairing constraints associated with secondary structure. The case for such sequences being actual ancestors is greatly improved if they can be shown to be biologically functional. Results: A unique common ancestral sequence of 28 Vibrio 5S ribosomal RNA sequences predicted by parsimony was resurrected and found to be functional in the context of the E. coli cellular environment. The functionality of various point variants and intermediates that were constructed as part of the resurrection was examined in detail. When separately introduced, the changes at single stranded positions and individual double variants at base-paired positions were also viable. An additional double variant was examined at a different base-paired position and it was also valid. Conclusions: The results show that at least in the case of the 5S rRNAs considered here, ancestors predicted by parsimony are likely to be realistic when the prediction is not overly influenced by single outliers. It is especially noteworthy that the phenotype of the predicted ancestors could be anticipated as a cumulative consequence of the phenotypes of the individual variants that comprised them. Thus, point mutation data is potentially useful in evaluating the reasonableness of ancestral sequences predicted by parsimony or other methods. The results also suggest that in the absence of significant tertiary structure constraints double variants that preserve pairing in stem regions will typically be accepted. Overall, the results suggest that it will be feasible to resurrect additional meaningful 5S rRNA ancestors as well as ancestral sequences of many different types of RNA.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

On the use of cartographic projections in visualizing phylogenetic tree space

Author: A Clark
A Hultman
A Kupczok
A Stamatakis
B Allen
B Chor
B Herring
B Jenkins
C Sing
D Gusfield
D Hillis
DJ Zwickl
DL Swofford
F Ronquist
G Ganapathy
H Carroll
J Keith
J Thompson
K Crandall
Kenneth Sundberg
L Bugayevskiy
LJ Billera
M Chase
M Waterman
Mark Clement
N Amenta
N Pattengale
Quinn Snell
R DeSalle
R Meier
S Guindon
W Basalaj
W Day
Publication venue: BioMed Central
Publication date: 01/01/2010

Phylogenetic analysis is becoming an increasingly important tool for biological research. Applications include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is a known NP-Hard problem. The size of the data sets which can be analyzed is limited by the exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better methods, which in turn could lead to the feasible analysis of more data sets. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger data sets
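
As a rough illustration of the growth in the number of trees that this abstract refers to, the sketch below counts unrooted, fully bifurcating topologies on n labeled taxa using the standard formula (2n - 5)!!. The function name `num_unrooted_binary_trees` is hypothetical and is not taken from the paper; the paper's own definition of tree space may differ from this simple count.

```python
from math import prod


def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating topologies on n labeled taxa.

    Uses the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5),
    which is defined here for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    # Odd numbers from 1 up to and including 2n - 5.
    return prod(range(1, 2 * n_taxa - 4, 2))


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n))
```

Even at 10 taxa the count already exceeds two million, which is why exhaustive search is replaced by heuristics in practice.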

Crossref

Springer - Publisher Connector

PubMed Central

Assessing the Value of DNA Barcodes for Molecular Phylogenetics: Effect of Increased Taxon Sampling in Lepidoptera

Author: A Mitchell
A Purvis
A Zwick
AD Warren
AH Wortley
Ahmed Moustafa
AJ Phillips
AV Brower
AVZ Brower
AY Kawahara
B Kolaczkowski
BA Salisbury
C Simon
CO Webb
D Tautz
DD Pollock
DL Swofford
DM Hillis
E Bazin
EJ Feil
J Tamura K Dudley
JC Regier
JJ Wilson
John James Wilson
L Kaila
M Fibiger
M Hajibabaei
M Kallersjo
M Mutanen
MG Pogue
MJ Scoble
MT Monaghan
N Wahlberg
NM Franz
NP Kristensen
PA Goloboff
PDN Hebert
R DeSalle
R Floyd
RI Vane-Wright
S Ratnasingham
SM Hedtke
SR Bucheli
TA Hall
V Savolainen
WJ Kress
ZH Yang
Publication venue: Public Library of Science
Publication date: 09/09/2011

BACKGROUND: A common perception is that DNA barcode datamatrices have limited phylogenetic signal due to the small number of characters available per taxon. However, another school of thought suggests that the massively increased taxon sampling afforded through the use of DNA barcodes may considerably increase the phylogenetic signal present in a datamatrix. Here I test this hypothesis using a large dataset of macrolepidopteran DNA barcodes. METHODOLOGY/PRINCIPAL FINDINGS: Taxon sampling was systematically increased in datamatrices containing macrolepidopteran DNA barcodes. Sixteen family groups were designated as concordance groups, and two quantitative measures, the taxon consistency index and the taxon retention index, were used to assess any changes in phylogenetic signal as a result of the increase in taxon sampling. DNA barcodes alone, even with maximal taxon sampling (500 species per family), were not sufficient to reconstruct monophyly of families, and increased taxon sampling generally increased the number of clades formed per family. However, the scores indicated a similar level of taxon retention (species from a family clustering together) in the cladograms as the number of species included in the datamatrix was increased, suggesting substantial phylogenetic signal below the 'family' branch. CONCLUSIONS/SIGNIFICANCE: The development of supermatrix, supertree or constrained tree approaches could enable the exploitation of the massive taxon sampling afforded through DNA barcodes for phylogenetics, connecting the twigs resolved by barcodes to the deep branches resolved through phylogenomics

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UM Digital Repository

Additions to the Mycosphaerella complex

Author: A Aptroot
A Rambaut
CL Schoch
DF Farr
DJ Soares
DL Swofford
DM Hillis
FL Stevens
GJM Verkley
GS Hoog de
I Carbone
JC Batzer
JE Taylor
K Bensch
K O’Donnell
K Schubert
M Corlett
MC Pretorius
MN Cortinas
PA Barber
PW Crous
R Vilgalys
RDM Page
RW Rayner
TJ White
U Braun
Publication venue: Nationaal Herbarium Nederland & Centraalbureau voor Schimmelcultures
Publication date: 01/01/2011

Species in the present study were compared based on their morphology, growth characteristics in culture, and DNA sequences of the nuclear ribosomal RNA gene operon (including ITS1, ITS2, 5.8S nrDNA and the first 900 bp of the 28S nrDNA) for all species and partial actin and translation elongation factor 1-alpha gene sequences for Cladosporium species. New species of Mycosphaerella (Mycosphaerellaceae) introduced in this study include M. cerastiicola (on Cerastium semidecandrum, The Netherlands), and M. etlingerae (on Etlingera elatior, Hawaii). Mycosphaerella holualoana is newly reported on Hedychium coronarium (Hawaii). Epitypes are also designated for Hendersonia persooniae, the basionym of Camarosporula persooniae, and for Sphaerella agapanthi, the basionym of Teratosphaeria agapanthi comb. nov. (Teratosphaeriaceae) on Agapanthus umbellatus from South Africa. The latter pathogen is also newly recorded from A. umbellatus in Europe (Portugal). Furthermore, two sexual species of Cladosporium (Davidiellaceae) are described, namely C. grevilleae (on Grevillea sp., Australia), and C. silenes (on Silene maritima, UK). Finally, the phylogenetic positions of two genera are newly confirmed, namely Camarosporula (based on C. persooniae, teleomorph Anthracostroma persooniae), which is a leaf pathogen of Persoonia spp. in Australia and belongs to the Teratosphaeriaceae, and Sphaerulina (based on S. myriadea), which occurs on leaves of Fagaceae (Carpinus, Castanopsis, Fagus, Quercus) and belongs to the Mycosphaerellaceae

Crossref

PubMed Central

Wageningen University & Research Publications

Large-Scale Phylogenetic Analysis of Emerging Infectious Diseases

Author: A Moilanen
A Phillips
A Tehler
AR Lemmon
B Budowle
B Chang
B Grenfell
B Rannala
BD Redelings
BE Martina
C Ceron
C Scholtissek
D Earn
D Franz
D Janies
D Morrison
D Pol
D Sankoff
D Searls
DJ Zwickl
DL Swofford
DM Hillis
E Ghedin
E Holmes
E Ukkonen
EM Rubin
G Laver
H Song
J Antonovics
J Felsenstein
J Huelsenbeck
J Plotkin
J Silvertown
J Thornton
JD Thompson
JK Taubenberger
JL Thorne
JP Carulli
JS Farris
K Li
K Ungchusak
KC Nixon
KP White
L Wang
L Watrous
LA Salter
LH Taylor
LR Foulds
M Gammelin
M Gibbs
M Koopmans
M Metzker
MA Charleston
MA Marra
MD Hendy
MJ Brauer
N Saitou
NM Ferguson
P Palese
PA Goloboff
PA Rota
PO Lewis
Q Wang
R Fleissner
RG Webster
RM Bush
RS Ross
S Lau
S Li
S Morse
S Poe
T Fanning
T Grant
T Ksiazek
The Chinese SARS Molecular Epidemiology Consortium
W Hennig
W Li
W Wheeler
WC Wheeler
WM Fitch
Y Guan
Y Lin
Y Suzuki
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

Microorganisms that cause infectious diseases present critical issues of national security, public health, and economic welfare. For example, in recent years, highly pathogenic strains of avian influenza have emerged in Asia, spread through Eastern Europe and threaten to become pandemic. As demonstrated by the coordinated response to Severe Acute Respiratory Syndrome (SARS) and influenza, agents of infectious disease are being addressed via large-scale genomic sequencing. The goal of genomic sequencing projects is to rapidly put large amounts of data in the public domain to accelerate research on disease surveillance, treatment, and prevention. However, our ability to derive information from large comparative genomic datasets lags far behind acquisition. Here we review the computational challenges of comparative genomic analyses, specifically sequence alignment and reconstruction of phylogenetic trees. We present novel analytical results from two important infectious diseases, Severe Acute Respiratory Syndrome (SARS) and influenza. SARS and influenza have similarities and important differences both as biological and comparative genomic analysis problems. Influenza viruses (Orthomyxoviridae) are RNA based. Current evidence indicates that influenza viruses originate in aquatic birds from wild populations. Influenza has been studied for decades via well-coordinated international efforts. These efforts center on surveillance via antibody characterization of the hemagglutinin (HA) and neuraminidase (N) proteins of the circulating strains to inform vaccine design. However, we still do not have a clear understanding of: 1) various transmission pathways such as the role of intermediate hosts such as swine and domestic birds and 2) the key mutation and genomic recombination events that underlie periodic pandemics of influenza. In the past 30 years, sequence data from HA and N loci has become an important data type. In the past year, full genomic data has become prominent. 
These data present exciting opportunities to address unanswered questions in influenza pandemics. SARS is caused by a previously unrecognized lineage of coronavirus, SARS-CoV, which like influenza has an RNA-based genome. Although SARS-CoV is widely believed to have originated in animals, there remains disagreement over the candidate animal source that led to the original outbreak of SARS. In contrast to the long history of the study of influenza, SARS was only recognized in late 2002 and the virus that causes SARS has been documented primarily by genomic sequencing. In the past, most studies of influenza were performed on a limited number of isolates and genes suited to a particular problem. Major goals in science today are to understand emerging diseases in broad geographic, environmental, societal, biological, and genomic contexts. Synthesizing diverse information brought together by various researchers is important to find out what can be done to prevent future outbreaks {JON03}. Thus comprehensive means to organize and analyze large amounts of diverse information are critical. For example, the relationships of isolates and patterns of genomic change observed in large datasets might not be consistent with hypotheses formed on partial data. Moreover, when researchers rely on partial datasets, they restrict the range of possible discoveries. Phylogenetics is well suited to the complex task of understanding emerging infectious disease. Phylogenetic analyses can test many hypotheses by comparing diverse isolates collected from various hosts, environments, and points in time and organizing these data into various evolutionary scenarios. The products of a phylogenetic analysis are a graphical tree of ancestor-descendent relationships and an inferred summary of mutations, recombination events, host shifts, geographic, and temporal spread of the viruses. However, this synthesis comes at a price. 
The cost of computation of phylogenetic analysis expands combinatorially as the number of isolates considered increases. Thus, large datasets like those currently produced are commonly considered intractable. We address this problem with synergistic development of heuristic tree search strategies and parallel computing. Affiliation: Janies, D., Ohio State University, United States. Affiliation: Pol, Diego, Ohio State University, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Argentin

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Recommended from our members

Shading Beats Binocular Disparity in Depth from Luminance Gradients: Evidence against a Maximum Likelihood Principle for Cue Combination

Author: A Johnston
A. J. Schofield
AJ van Doorn
AP Pentland
BG Khang
BKP Horn
C Christou
CC Chen
Chien-Chung Chen
Christopher William Tyler
CW Tyler
D Brewster
D Marr
DH Brainard
DL MacAdam
E Mingolla
FD Reichel
H Hill
HE Gerhard
HH Bülthoff
J Sun
JF Norman
JJ Clark
JJ Koenderink
JM Hillis
JT Todd
LT Likova
M D’Zmura
M Wright
MS Langer
P Mamassian
P Sun
PC Doorschot
PG Lovell
RL Gregory
Samuel G. Solomon
VS Ramachandran
YL Lee
Z Pizlo
Publication venue: Public Library of Science (PLoS)
Publication date: 10/08/2015

Perceived depth is conveyed by multiple cues, including binocular disparity and luminance shading. Depth perception from luminance shading information depends on the perceptual assumption for the incident light, which has been shown to default to a diffuse illumination assumption. We focus on the case of sinusoidally corrugated surfaces to ask how shading and disparity cues combine when defined by the joint luminance gradients and intrinsic disparity modulation that would occur in viewing the physical corrugation of a uniform surface under diffuse illumination. Such surfaces were simulated with a sinusoidal luminance modulation (0.26 or 1.8 cy/deg, contrast 20%-80%) modulated either in-phase or in opposite phase with a sinusoidal disparity of the same corrugation frequency, with disparity amplitudes ranging from 0’-20’. The observers’ task was to adjust the binocular disparity of a comparison random-dot stereogram surface to match the perceived depth of the joint luminance/disparity-modulated corrugation target. Regardless of target spatial frequency, the perceived target depth increased with the luminance contrast and depended on luminance phase but was largely unaffected by the luminance disparity modulation. These results validate the idea that human observers can use the diffuse illumination assumption to perceive depth from luminance gradients alone without making an assumption of light direction. For depth judgments with combined cues, the observers gave much greater weighting to the luminance shading than to the disparity modulation of the targets. The results were not well-fit by a Bayesian cue-combination model weighted in proportion to the variance of the measurements for each cue in isolation. Instead, they suggest that the visual system uses disjunctive mechanisms to process these two types of information rather than combining them according to their likelihood ratios

City Research Online

Crossref

Directory of Open Access Journals

PubMed Central

FigShare