A.W.F. Edwards and the origin of Bayesian phylogenetics

Abstract

In the early 1960s, Anthony Edwards and Luca Cavalli-Sforza attempted to apply R.A. Fisher’s maximum likelihood (ML) method to estimate genealogical trees of human populations from gene frequency data. They used the Yule branching process to describe the probabilities of the trees and branching times, and the Brownian motion process to model the drift of gene frequencies (after a suitable transformation) over time along the branches. They encountered considerable difficulties, including “singularities” in the likelihood surface, mainly because they did not clearly distinguish between parameters and random variables. In the process they invented the distance (additive-tree) and parsimony (minimum-evolution) methods, both of which they viewed as heuristic approximations to ML. The statistical nature of the inference problem was not clarified until Edwards [1], which pointed out that the trees should be estimated from their conditional distribution given the genetic data, rather than from the “likelihood function”. In modern terminology, this is the Bayesian approach to phylogeny estimation: the Yule process specifies a prior on trees, and the conditional distribution of the trees given the data is the posterior. This article discusses the connections between the remarkable paper of Edwards [1] and modern Bayesian phylogenetics, and briefly comments on some modelling decisions Edwards made that still concern us in Bayesian phylogenetics today. The reader I have in mind is familiar with modern phylogenetic methods but may not have read Edwards [1], which was published in a statistics journal.
