Simultaneous reconstruction of evolutionary history and epidemiological dynamics from viral sequences with the birth-death SIR model
The evolution of RNA viruses such as HIV, Hepatitis C and Influenza virus
occurs so rapidly that the viruses' genomes contain information on past
ecological dynamics. Hence, we develop a phylodynamic method that enables the
joint estimation of epidemiological parameters and phylogenetic history. Based
on a compartmental susceptible-infected-removed (SIR) model, this method
provides separate information on incidence and prevalence of infections.
Detailed information on the interaction of host population dynamics and
evolutionary history can inform decisions on how to contain or entirely avoid
disease outbreaks.
We apply our Birth-Death SIR method (BDSIR) to two viral data sets. First,
five human immunodeficiency virus type 1 clusters sampled in the United Kingdom
between 1999 and 2003 are analyzed. The estimated basic reproduction ratios
range from 1.9 to 3.2 among the clusters. All clusters show a decline in the
growth rate of the local epidemic in the mid or late 1990s.
The analysis of a hepatitis C virus (HCV) genotype 2c data set shows that the
local epidemic in the Córdoban city Cruz del Eje originated around 1906
(median), coinciding with an immigration wave from Europe to central Argentina
that dates from 1880 to 1920. The estimated time of epidemic peak is around 1970.
Comment: Journal link:
http://rsif.royalsocietypublishing.org/content/11/94/20131106.ful
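To make the distinction between incidence and prevalence in a compartmental SIR model concrete, here is a minimal sketch. It is a plain deterministic SIR model integrated with Euler steps, not the stochastic birth-death SIR method of the paper. The function name `simulate_sir` and all parameter values are hypothetical and chosen only for illustration.

```python
def simulate_sir(beta, gamma, n, i0, dt=0.01, t_end=100.0):
    """Deterministic SIR with forward-Euler steps (illustrative assumption).

    beta  : transmission rate, gamma : removal rate, n : population size,
    i0    : initial number of infected individuals.
    Returns a list of (time, S, I, R, incidence) tuples, where
    incidence = beta * S * I / n is the rate of new infections and
    I is the prevalence (currently infected individuals).
    """
    s, i, r = float(n - i0), float(i0), 0.0
    steps = int(round(t_end / dt))
    trajectory = []
    for k in range(steps + 1):
        incidence = beta * s * i / n
        trajectory.append((k * dt, s, i, r, incidence))
        new_infections = incidence * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
    return trajectory


# Hypothetical parameters giving a basic reproduction ratio beta / gamma = 2.
traj = simulate_sir(beta=0.5, gamma=0.25, n=1000, i0=1)
t_peak_incidence = max(traj, key=lambda row: row[4])[0]
t_peak_prevalence = max(traj, key=lambda row: row[2])[0]
print("incidence peaks at t =", t_peak_incidence)
print("prevalence peaks at t =", t_peak_prevalence)
```

In this toy model the incidence curve peaks before the prevalence curve, which is one reason the two quantities are reported separately.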
Inferring Species Trees from Incongruent Multi-Copy Gene Trees Using the Robinson-Foulds Distance
We present a new method for inferring species trees from multi-copy gene
trees. Our method is based on a generalization of the Robinson-Foulds (RF)
distance to multi-labeled trees (mul-trees), i.e., gene trees in which multiple
leaves can have the same label. Unlike most previous phylogenetic methods using
gene trees, this method does not assume that gene tree incongruence is caused
by a single, specific biological process, such as gene duplication and loss,
deep coalescence, or lateral gene transfer. We prove that it is NP-hard to
compute the RF distance between two mul-trees, but it is easy to calculate the
generalized RF distance between a mul-tree and a singly-labeled tree. Motivated
by this observation, we formulate the RF supertree problem for mul-trees
(MulRF), which takes a collection of mul-trees and constructs a species tree
that minimizes the total RF distance from the input mul-trees. We present a
fast heuristic algorithm for the MulRF supertree problem. Simulation
experiments demonstrate that the MulRF method produces more accurate species
trees than gene tree parsimony methods when incongruence is caused by gene tree
error, duplications and losses, and/or lateral gene transfer. Furthermore, the
MulRF heuristic runs quickly on data sets containing hundreds of trees with up
to a hundred taxa.
Comment: 16 pages, 11 figure
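For orientation, the following sketch computes the ordinary Robinson-Foulds distance between two rooted, singly-labeled trees by comparing their clusters. It does not implement the mul-tree generalization or the MulRF heuristic described in the abstract. The nested-tuple tree encoding and the function names `clusters` and `rf_distance` are assumptions made for this example.

```python
def clusters(tree):
    """Return the non-trivial clusters of a rooted tree.

    A tree is a nested tuple whose leaves are strings (assumed encoding).
    A cluster is the set of leaf labels below an internal node.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root cluster is shared by all trees on these taxa
    return found


def rf_distance(tree_a, tree_b):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(t1, t2))  # 2: the clusters {A,B} and {A,C} are each unique to one tree
```

The total RF distance from a candidate species tree to a collection of gene trees would be the sum of such pairwise values, which is the quantity a supertree search tries to minimize.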
Generalized Buneman pruning for inferring the most parsimonious multi-state phylogeny
Accurate reconstruction of phylogenies remains a key challenge in
evolutionary biology. Most biologically plausible formulations of the problem
are formally NP-hard, with no known efficient solution. Standard practice
relies on fast heuristic methods that are empirically known to work very
well in general but can yield results arbitrarily far from optimal. Practical
exact methods, which yield exponential worst-case running times but generally
much better times in practice, provide an important alternative. We report
progress in this direction by introducing a provably optimal method for the
weighted multi-state maximum parsimony phylogeny problem. The method is based
on generalizing the notion of the Buneman graph, a construction key to
efficient exact methods for binary sequences, so as to apply to sequences with
arbitrary finite numbers of states with arbitrary state transition weights. We
implement an integer linear programming (ILP) method for the multi-state
problem using this generalized Buneman graph and demonstrate that the resulting
method is able to solve data sets that are intractable by prior exact methods
in run times comparable with popular heuristics. Our work provides the first
method for provably optimal maximum parsimony phylogeny inference that is
practical for multi-state data sets of more than a few characters.
Comment: 15 page
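As background on what "weighted multi-state parsimony" scores, the sketch below evaluates the parsimony cost of a single character on a fixed rooted tree using the classical Sankoff dynamic program. This is only the scoring step for a given tree; it is not the generalized Buneman graph construction or the ILP method of the paper, which search over trees. The tree encoding, the function name `sankoff_cost`, and the cost matrices are hypothetical.

```python
def sankoff_cost(tree, leaf_states, states, cost):
    """Minimum weighted parsimony cost of one character on a fixed tree.

    tree        : nested tuple with string leaf names (assumed encoding)
    leaf_states : dict mapping leaf name -> observed state
    states      : iterable of all possible states
    cost        : cost[s][t] is the weight of a change from state s to t
    """
    def node_costs(node):
        if isinstance(node, str):
            # A leaf is free in its observed state and forbidden otherwise.
            return {s: (0.0 if s == leaf_states[node] else float("inf"))
                    for s in states}
        total = {s: 0.0 for s in states}
        for child in node:
            child_costs = node_costs(child)
            for s in states:
                total[s] += min(cost[s][t] + child_costs[t] for t in states)
        return total

    return min(node_costs(tree).values())


tree = (("A", "B"), ("C", "D"))
observed = {"A": 0, "B": 0, "C": 1, "D": 2}
states = (0, 1, 2)

unit = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]      # every change costs 1
weighted = [[0, 2, 2], [2, 0, 1], [2, 1, 0]]  # changes involving state 0 cost 2

print(sankoff_cost(tree, observed, states, unit))      # 2.0
print(sankoff_cost(tree, observed, states, weighted))  # 3.0
```

With unit weights the score reduces to the usual count of state changes; arbitrary transition weights only change the `cost` matrix, not the recursion.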
Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data
1. Coalescent-based species delimitation methods combine population genetic and phylogenetic theory to provide an objective means for delineating evolutionarily significant units of diversity. The Generalized Mixed Yule Coalescent (GMYC) and the Poisson Tree Process (PTP) are methods that use ultrametric (GMYC or PTP) or non-ultrametric (PTP) gene trees as input, intended for use mostly with single-locus data such as DNA barcodes.
2. Here we assess how robust the GMYC and PTP are to different phylogenetic reconstruction and branch smoothing methods. We reconstruct over 400 ultrametric trees using up to 30 different combinations of phylogenetic and smoothing methods and perform over 2,000 separate species delimitation analyses across 16 empirical datasets. We then assess how variable diversity estimates are, in terms of richness and identity, with respect to species delimitation, phylogenetic and smoothing methods.
3. The PTP method generally generates diversity estimates that are more robust to different phylogenetic methods. The GMYC is more sensitive, but provides consistent estimates for BEAST trees. The lower consistency of GMYC estimates is likely a result of differences among gene trees introduced by the smoothing step. Unresolved nodes (real anomalies or methodological artefacts) affect both GMYC and PTP estimates, but have a greater effect on GMYC estimates. Branch smoothing is a difficult step and perhaps an underappreciated source of bias that may be widespread among studies of diversity and diversification.
4. Nevertheless, careful choice of phylogenetic method does produce equivalent PTP and GMYC diversity estimates. We recommend simultaneous use of the PTP model with any model-based gene tree (e.g. RAxML) and GMYC approaches with BEAST trees for obtaining species hypotheses
Bayesian inference of sampled ancestor trees for epidemiology and fossil calibration
Phylogenetic analyses which include fossils or molecular sequences that are
sampled through time require models that allow one sample to be a direct
ancestor of another sample. As previously available phylogenetic inference
tools assume that all samples are tips, they do not allow for this possibility.
We have developed and implemented a Bayesian Markov Chain Monte Carlo (MCMC)
algorithm to infer what we call sampled ancestor trees, that is, trees in which
sampled individuals can be direct ancestors of other sampled individuals. We
use a family of birth-death models where individuals may remain in the tree
process after sampling; in particular, we extend the birth-death skyline
model [Stadler et al., 2013] to sampled ancestor trees. This method allows the
detection of sampled ancestors as well as estimation of the probability that an
individual will be removed from the process when it is sampled. We show that
sampled ancestor birth-death models where all samples come from different time
points are non-identifiable and thus require one parameter to be known in order
to infer other parameters. We apply this method to epidemiological data, where
the possibility of sampled ancestors enables us to identify individuals that
infected other individuals after being sampled and to infer fundamental
epidemiological parameters. We also apply the method to infer divergence times
and diversification rates when fossils are included among the species samples,
so that fossilisation events are modelled as a part of the tree branching
process. Such modelling has many advantages, as argued in the literature. The
sampler is available as an open-source BEAST2 package
(https://github.com/gavryushkina/sampled-ancestors).
Comment: 34 pages (including Supporting Information), 8 figures, 1 table. Part
of the work presented at Epidemics 2013 and The 18th Annual New Zealand
Phylogenomics Meeting, 201