Maximum likelihood (ML) estimators for scaled mutation parameters with a strand symmetric mutation model in equilibrium

Abstract

Under the multiallelic parent-independent mutation-drift model, the equilibrium proportions of alleles are known to be Dirichlet distributed. A special case is the biallelic model, in which the proportions are beta distributed. A sample taken from these models is then Dirichlet-multinomially or beta-binomially distributed, respectively. Maximum likelihood (ML) estimators for the mutation parameters of the biallelic parent-independent mutation model are available via an expectation-maximization algorithm. Assuming small scaled mutation rates, the distribution of a sample of size M can be expanded in a first-order Taylor series. The ML estimators for the two parameters of the biallelic model can then be expressed in terms of the site frequency spectrum. In this article, we go beyond parent-independent mutation and analyse a strand-symmetric mutation model with six scaled mutation parameters that deviates from parent-independent mutation and, generally, from detailed balance. We derive ML estimators for these six parameters assuming mutation-drift equilibrium and small scaled mutation rates. This is the first time that ML estimators are provided for a mutation model more complex than parent-independent mutation.
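The first-order expansion mentioned for the biallelic case can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes a parametrization that the abstract does not spell out: an overall scaled mutation rate `theta` and a mutation bias `alpha` towards the focal allele, so that the beta parameters are `alpha * theta` and `(1 - alpha) * theta`. The function names are hypothetical. Under these assumptions, expanding the beta-binomial probabilities to first order in `theta` gives the approximation coded in `first_order_pmf`, where the harmonic number is summed up to M - 1.

```python
from math import exp, lgamma


def beta_binomial_pmf(y, M, a, b):
    """Exact beta-binomial probability of y focal alleles in a sample of size M."""
    log_choose = lgamma(M + 1) - lgamma(y + 1) - lgamma(M - y + 1)
    log_beta_num = lgamma(y + a) + lgamma(M - y + b) - lgamma(M + a + b)
    log_beta_den = lgamma(a) + lgamma(b) - lgamma(a + b)
    return exp(log_choose + log_beta_num - log_beta_den)


def first_order_pmf(y, M, alpha, theta):
    """First-order Taylor approximation in theta (assumed parametrization)."""
    beta = 1.0 - alpha
    # Harmonic number H_{M-1} = 1 + 1/2 + ... + 1/(M-1)
    harmonic = sum(1.0 / i for i in range(1, M))
    if y == 0:
        # Monomorphic for the non-focal allele
        return beta * (1.0 - alpha * theta * harmonic)
    if y == M:
        # Monomorphic for the focal allele
        return alpha * (1.0 - beta * theta * harmonic)
    # Polymorphic classes: proportional to theta
    return alpha * beta * theta * M / (y * (M - y))


if __name__ == "__main__":
    M, alpha, theta = 10, 0.3, 0.01  # illustrative values only
    for y in range(M + 1):
        exact = beta_binomial_pmf(y, M, alpha * theta, (1.0 - alpha) * theta)
        approx = first_order_pmf(y, M, alpha, theta)
        print(f"y={y:2d}  exact={exact:.6f}  first-order={approx:.6f}")
```

In this sketch the polymorphic classes depend on the parameters only through the product `alpha * (1 - alpha) * theta`, while the two monomorphic classes carry the information about `alpha`, which is why estimators under the approximation can be written in terms of the site frequency spectrum.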
