Principal components analysis in the space of phylogenetic trees
Phylogenetic analysis of DNA or other data commonly gives rise to a
collection or sample of inferred evolutionary trees. Principal Components
Analysis (PCA) cannot be applied directly to collections of trees since the
space of evolutionary trees on a fixed set of taxa is not a vector space. This
paper describes a novel geometrical approach to PCA in tree-space that
constructs the first principal path in an analogous way to standard linear
Euclidean PCA. Given a data set of phylogenetic trees, a geodesic principal
path is sought that maximizes the variance of the data under a form of
projection onto the path. Due to the high dimensionality of tree-space and the
nonlinear nature of this problem, the computational complexity is potentially
very high, so approximate optimization algorithms are used to search for the
optimal path. Principal paths identified in this way reveal and quantify the
main sources of variation in the original collection of trees in terms of both
topology and branch lengths. The approach is illustrated by application to
simulated sets of trees and to a set of gene trees from metazoan (animal)
species.
Comment: Published at http://dx.doi.org/10.1214/11-AOS915 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
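For orientation, the standard Euclidean case that the abstract takes as its analogy can be sketched as follows. This is only the linear baseline, not the tree-space method of the paper: in Euclidean space the first principal direction is the unit vector that maximizes the variance of the data projected onto it. The function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def first_principal_direction(points):
    """Return the unit vector that maximizes the variance of the projected
    points, together with that variance (ordinary Euclidean PCA)."""
    centered = points - points.mean(axis=0)
    # The right singular vectors of the centered data are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # Sample variance (n - 1 denominator) of the data along that direction.
    variance = singular_values[0] ** 2 / (len(points) - 1)
    return direction, variance

rng = np.random.default_rng(0)
# Hypothetical data: 200 points in 3 dimensions, spread mostly along the first axis.
data = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.5])

direction, variance = first_principal_direction(data)

# Cross-check: project the centered data onto the direction and measure the variance.
projected_variance = np.var((data - data.mean(axis=0)) @ direction, ddof=1)
```

In tree-space there is no such closed-form solution, because the space is not a vector space; that is why the abstract describes a search over geodesic paths with approximate optimization instead.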
Simultaneous reconstruction of evolutionary history and epidemiological dynamics from viral sequences with the birth-death SIR model
The evolution of RNA viruses such as HIV, Hepatitis C and Influenza virus
occurs so rapidly that the viruses' genomes contain information on past
ecological dynamics. Hence, we develop a phylodynamic method that enables the
joint estimation of epidemiological parameters and phylogenetic history. Based
on a compartmental susceptible-infected-removed (SIR) model, this method
provides separate information on incidence and prevalence of infections.
Detailed information on the interaction of host population dynamics and
evolutionary history can inform decisions on how to contain or entirely avoid
disease outbreaks.
We apply our Birth-Death SIR method (BDSIR) to two viral data sets. First,
five human immunodeficiency virus type 1 clusters sampled in the United Kingdom
between 1999 and 2003 are analyzed. The estimated basic reproduction ratios
range from 1.9 to 3.2 among the clusters. All clusters show a decline in the
growth rate of the local epidemic in the mid- or late 1990s.
The analysis of a hepatitis C virus (HCV) genotype 2c data set shows that the
local epidemic in the Córdoban city Cruz del Eje originated around 1906
(median), coinciding with an immigration wave from Europe to central Argentina
that dates from 1880–1920. The estimated time of the epidemic peak is around 1970.
Comment: Journal link:
http://rsif.royalsocietypublishing.org/content/11/94/20131106.ful
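The distinction between incidence and prevalence, and the meaning of the basic reproduction ratio, can be illustrated with a plain deterministic SIR model. This is a minimal sketch under stated assumptions, not the stochastic birth-death SIR method of the paper: the function name, the Euler time stepping, and all parameter values are hypothetical and were chosen only so that the reproduction ratio falls inside the 1.9 to 3.2 range quoted above.

```python
def simulate_sir(beta, gamma, population, initial_infected, days, dt=0.1):
    """Integrate a deterministic SIR model with simple Euler steps.

    Returns the incidence (new infections per day), the prevalence
    (number currently infected), and the final (S, I, R) compartments.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    incidence, prevalence = [], []
    for _ in range(round(days / dt)):
        new_infections = beta * s * i / population * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        incidence.append(new_infections / dt)  # rate of new cases
        prevalence.append(i)                   # cases existing at this time
    return incidence, prevalence, (s, i, r)

# Hypothetical rates: transmission 0.5 per day, removal 0.2 per day.
beta, gamma = 0.5, 0.2
r0 = beta / gamma  # basic reproduction ratio of this toy model

incidence, prevalence, final_state = simulate_sir(
    beta, gamma, population=10_000, initial_infected=10, days=200
)
peak_incidence_step = incidence.index(max(incidence))
peak_prevalence_step = prevalence.index(max(prevalence))
```

In this toy model the incidence curve peaks before the prevalence curve, which is one reason separate estimates of the two quantities carry different information.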
SigTree: A Microbial Community Analysis Tool to Identify and Visualize Significantly Responsive Branches in a Phylogenetic Tree.
Microbial community analysis experiments that use high-throughput genomics to assess the effect of a treatment intervention (or environmental change) on the relative abundance levels of multiple related microbial species (or operational taxonomic units) simultaneously are becoming increasingly common. Within the framework of the evolutionary phylogeny of all species considered in the experiment, this translates to a statistical need to identify the phylogenetic branches that exhibit a significant consensus response (in terms of operational taxonomic unit abundance) to the intervention. We present the R software package SigTree, a collection of flexible tools that make use of meta-analysis methods and regular expressions to identify and visualize significantly responsive branches in a phylogenetic tree, while appropriately adjusting for multiple comparisons
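The general idea of combining per-taxon evidence into a branch-level result and then adjusting for multiple comparisons can be sketched as below. This is an illustration in Python, not SigTree's R interface: Stouffer's method and the Bonferroni correction are assumed here as standard examples of a meta-analysis method and a multiple-comparison adjustment, and the branch names and p-values are invented.

```python
from math import sqrt
from statistics import NormalDist

def stouffer_combined_p(p_values):
    """Combine one-sided p-values (each strictly between 0 and 1)
    into a single p-value with Stouffer's Z-score method."""
    normal = NormalDist()
    z_scores = [normal.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    return 1.0 - normal.cdf(combined_z)

# Hypothetical one-sided p-values for the tips (OTUs) under two branches.
branch_tip_p_values = {
    "branch_A": [0.01, 0.03, 0.02],
    "branch_B": [0.40, 0.65, 0.20],
}

# One combined p-value per branch.
combined = {
    name: stouffer_combined_p(p_values)
    for name, p_values in branch_tip_p_values.items()
}

# Bonferroni adjustment across the branches that were tested.
adjusted = {name: min(1.0, p * len(combined)) for name, p in combined.items()}
```

With these made-up inputs, the branch whose tips all respond in the same direction stays significant after adjustment, while the branch with mixed evidence does not.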
- …