23 research outputs found

    Measuring Asymmetry in Time-Stamped Phylogenies.

    Previous work has shown that asymmetry in viral phylogenies may be indicative of heterogeneity in transmission, for example due to acute HIV infection or the presence of 'core groups' with higher contact rates. Hence, evidence of asymmetry may provide clues to underlying population structure, even when direct information on, for example, stage of infection or contact rates is missing. However, current tests of phylogenetic asymmetry (a) suffer from false positives when the tips of the phylogeny are sampled at different times and (b) only test for global asymmetry, and hence suffer from false negatives when asymmetry is localised to part of a phylogeny. We present a simple permutation-based approach for testing for asymmetry in a phylogeny, in which we compare the observed phylogeny with random phylogenies that have the same sampling and coalescence times, to reduce the false positive rate. We also demonstrate how profiles of asymmetry measures, calculated over a range of evolutionary times in the phylogeny, can be used to identify local asymmetry. In combination with different metrics of asymmetry, this approach offers detailed insights into how phylogenies reconstructed from real viral datasets may deviate from the simplistic assumptions of commonly used coalescent and birth-death process models. This work was supported by a Medical Research Council Methodology Research Programme grant to S.D.W.F. (grant number MR/J013862/1). This is the final version of the article. It first appeared from PLoS via http://dx.doi.org/10.1371/journal.pcbi.100431

    Biased phylodynamic inferences from analysing clusters of viral sequences.

    Phylogenetic methods are increasingly used to help understand the transmission dynamics of measurably evolving viruses, including HIV. Clusters of highly similar sequences are often observed, and their sizes appear to follow a 'power law', with a small number of very large clusters. These clusters may help to identify subpopulations in an epidemic and inform where intervention strategies should be implemented. However, clustering of samples does not necessarily imply the presence of a subpopulation with high transmission rates, as groups of closely related viruses can also occur due to non-epidemiological effects such as over-sampling. It is important to ensure that observed phylogenetic clustering reflects true heterogeneity in the transmitting population and is not being driven by non-epidemiological effects. We qualify the effect of using a falsely identified 'transmission cluster' of sequences to estimate phylodynamic parameters, including the effective population size and exponential growth rate, under several demographic scenarios. Our simulation studies show that taking the maximum size cluster to re-estimate parameters from trees simulated under a randomly mixing, constant population size coalescent process systematically underestimates the overall effective population size. In addition, the transmission cluster wrongly resembles an exponential or logistic growth model 99% of the time. We also illustrate the consequences of false clusters in exponentially growing coalescent and birth-death trees, where, again, the growth rate is skewed upwards. This has clear implications for identifying clusters in large viral databases, where a false cluster could result in wasted intervention resources.

    Data from: Measuring asymmetry in time-stamped phylogenies

    Previous work has shown that asymmetry in viral phylogenies may be indicative of heterogeneity in transmission, for example due to acute HIV infection or the presence of ‘core groups’ with higher contact rates. Hence, evidence of asymmetry may provide clues to underlying population structure, even when direct information on, for example, stage of infection or contact rates is missing. However, current tests of phylogenetic asymmetry (a) suffer from false positives when the tips of the phylogeny are sampled at different times and (b) only test for global asymmetry, and hence suffer from false negatives when asymmetry is localised to part of a phylogeny. We present a simple permutation-based approach for testing for asymmetry in a phylogeny, in which we compare the observed phylogeny with random phylogenies that have the same sampling and coalescence times, to reduce the false positive rate. We also demonstrate how profiles of asymmetry measures, calculated over a range of evolutionary times in the phylogeny, can be used to identify local asymmetry. In combination with different metrics of asymmetry, this approach offers detailed insights into how phylogenies reconstructed from real viral datasets may deviate from the simplistic assumptions of commonly used coalescent and birth-death process models.

    Trees in newick format

    This is a .zip file containing the within-host HIV, H5N1 influenza and Ebola virus trees in Newick format which were analysed in the paper. Many thanks to Andrew Rambaut (University of Edinburgh) for providing the Ebola phylogeny. Data originally from:
    1) Within-host HIV (P83_HIV.nwk): Frost SDW, Wrin T, Smith DM, Kosakovsky Pond SL, Liu Y, et al. (2005) Neutralizing antibody responses drive the evolution of human immunodeficiency virus type 1 envelope during recent HIV infection. Proc Natl Acad Sci USA 102: 18514-9.
    2) H5N1 influenza (H5N1_flu.nwk): Wallace RG, HoDac H, Lathrop RH, Fitch WM (2007) A statistical phylogeography of influenza A H5N1. Proc Natl Acad Sci USA 104: 4473-4478.
    3) Ebola virus (Ebola.nwk): Gire SK, Goba A, Andersen KG, Sealfon RSG, Park DJ, et al. (2014) Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science 345: 1369-1372.

    R script for analysis of trees

    R script used in the analysis of the HIV, influenza and Ebola trees, and to produce Figures 4, 6 and 7. Requires the ape, cluster, phylobase and adephylo packages in addition to the treeImbalance package included in this repository.

    treeImbalance

    An R package containing the functions used for detecting asymmetry, as described in the paper.

    Permuting a time-stamped tree.

    The times of the tips (solid blue lines) and internal nodes (dashed blue lines) from the observed tree (top, black) are preserved in the permuted tree (bottom, dark grey).
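
    The caption above describes a permutation that keeps every tip and internal-node time fixed while the branching pattern changes. A minimal Python sketch of one way to draw such a tree follows. It assumes that, at each coalescence time, a uniformly random pair of the lineages present at that time is merged; the function name `permute_tree`, the node numbering and the backwards-in-time convention are hypothetical choices for illustration and are not taken from the treeImbalance package.

```python
import random


def permute_tree(tip_times, node_times, seed=None):
    """Draw a random binary tree that keeps the given node times.

    Times are measured backwards from the most recent sample (0 = most recent).
    Tips are labelled 0..n-1; internal nodes are labelled n..2n-2 in order of
    increasing time. Returns a dict mapping each internal node to its two children.
    """
    rng = random.Random(seed)
    n = len(tip_times)
    if len(node_times) != n - 1:
        raise ValueError("a binary tree with n tips needs n - 1 internal nodes")

    # Merge both kinds of event into one timeline. At equal times, a tip is
    # sampled (kind 0) before a coalescence (kind 1) is processed.
    events = [(t, 0, i) for i, t in enumerate(tip_times)]
    events += [(t, 1, None) for t in node_times]
    events.sort(key=lambda e: (e[0], e[1]))

    active = []      # lineages that exist at the current time
    children = {}    # internal node -> (child, child)
    next_id = n
    for _time, kind, tip in events:
        if kind == 0:
            active.append(tip)
            continue
        if len(active) < 2:
            raise ValueError("fewer than two lineages at a coalescence time")
        # Assumption: any pair of extant lineages is equally likely to merge.
        a, b = rng.sample(active, 2)
        active.remove(a)
        active.remove(b)
        children[next_id] = (a, b)
        active.append(next_id)
        next_id += 1
    return children


# Four tips sampled at three different times, with three coalescence times.
tree = permute_tree([0, 0, 1, 2], [0.5, 1.5, 3.0], seed=1)
print(tree)
```

    With these particular times only two lineages exist at each coalescence, so every permutation yields the same ladder-like tree. This illustrates why sampling times alone can make a phylogeny look asymmetric.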

    Permutations of an observed tree can overcome bias in detecting asymmetry in time-sampled phylogenies.

    a) An ‘observed’ tree, simulated under the coalescent model with 100 sequences sampled over 10 time points, each 1000 generations apart, with an effective population size of 10^4. b) The distribution of Sackin’s index and the number of cherries for 100 random trees, simulated as in a) except that all tips are sampled at a single time point. Expected values for these distributions are shown with dashed black lines. The observed values (solid black lines) are highly extreme because of the implicit bias caused by tips sampled early in the ancestry. c) The same observed values compared with distributions calculated by permuting the observed tree. Here there is no evidence that the observed tree is asymmetric: the solid black line falls between the 2.5% and 97.5% quantiles (dashed red lines).
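
    The two statistics named in this caption can be sketched in a few lines of Python. Sackin's index is the sum over tips of the number of edges between the tip and the root, and a cherry is an internal node whose two children are both tips. The nested-tuple tree representation, the function names and the crude empirical quantile rule below are assumptions made for illustration, not the treeImbalance implementation.

```python
def sackin_index(tree, depth=0):
    """Sum of tip depths (edges from the root) for a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return depth  # a tip contributes its own depth
    return sum(sackin_index(child, depth + 1) for child in tree)


def count_cherries(tree):
    """Number of internal nodes whose two children are both tips."""
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    if not isinstance(left, tuple) and not isinstance(right, tuple):
        return 1
    return count_cherries(left) + count_cherries(right)


def within_null_interval(observed, null_values, lower=0.025, upper=0.975):
    """Check whether `observed` lies between two empirical quantiles.

    `null_values` would hold the statistic computed on permuted trees.
    The quantiles are read off the sorted values without interpolation.
    """
    ordered = sorted(null_values)
    low = ordered[int(lower * (len(ordered) - 1))]
    high = ordered[int(upper * (len(ordered) - 1))]
    return low <= observed <= high


caterpillar = ((("a", "b"), "c"), "d")   # fully asymmetric four-tip tree
balanced = (("a", "b"), ("c", "d"))      # fully symmetric four-tip tree

print(sackin_index(caterpillar), count_cherries(caterpillar))  # 9 1
print(sackin_index(balanced), count_cherries(balanced))        # 8 2
```

    A higher Sackin's index and fewer cherries both point towards a more asymmetric tree, which is why the caption reports the two measures side by side.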