6 research outputs found
On the convergence of the maximum likelihood estimator for the transition rate under a 2-state symmetric model
Maximum likelihood estimators are used extensively to estimate unknown
parameters of stochastic trait evolution models on phylogenetic trees. Although
the MLE has been proven to converge to the true value in the independent-sample
case, we cannot appeal to this result because trait values of different species
are correlated due to shared evolutionary history. In this paper, we consider a
2-state symmetric model for a single binary trait and investigate the
theoretical properties of the MLE for the transition rate in the large-tree
limit. Here, the large-tree limit is a theoretical scenario where the number of
taxa increases to infinity and we can observe the trait values for all species.
Specifically, we prove that the MLE converges to the true value under some
regularity conditions. These conditions ensure that the tree shape is not too
irregular, and hold for many practical scenarios such as trees with bounded
edges, trees generated from the Yule (pure birth) process, and trees generated
from the coalescent point process. Our result also provides an upper bound for
the distance between the MLE and the true value.
A Consistent Estimator of the Evolutionary Rate
We consider a branching particle system where particles reproduce according
to the pure birth Yule process with birth rate $L$, conditioned on the
observed number of particles being equal to $n$. Particles are assumed to move
independently on the real line according to Brownian motion with local
variance $s^2$. In this paper we treat the particles as a sample of related
species. The spatial Brownian motion of a particle describes the development of
a trait value of interest (e.g. log-body-size). We propose an unbiased
estimator $R_n^2$ of the evolutionary rate $r^2 = s^2/L$. The estimator $R_n^2$ is
proportional to the sample variance $S_n^2$ computed from $n$ trait values. We find
an approximate formula for the standard error of $R_n^2$ based on a neat asymptotic
relation for the variance of $S_n^2$.
Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies; we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.