Inference in population genetics using forward and backward, discrete and continuous time processes

Abstract

DS and CK were partially funded by FWF-P24551-B25. CK has been partially funded by the Vienna Science and Technology Fund (WWTF) through project MA16-061.A central aim of population genetics is the inference of the evolutionary history of a population. To this end, the underlying process can be represented by a model of the evolution of allele frequencies parametrized by e.g., the population size, mutation rates and selection coefficients. A large class of models use forward-in-time models, such as the discrete Wright-Fisher and Moran models and the continuous forward diffusion, to obtain distributions of population allele frequencies, conditional on an ancestral initial allele frequency distribution. Backward-in-time diffusion processes have been rarely used in the context of parameter inference. Here, we demonstrate how forward and backward diffusion processes can be combined to efficiently calculate the exact joint probability distribution of sample and population allele frequencies at all times in the past, for both discrete and continuous population genetics models. This procedure is analogous to the forward-backward algorithm of hidden Markov models. While the efficiency of discrete models is limited by the population size, for continuous models it suffices to expand the transition density in orthogonal polynomials of the order of the sample size to infer marginal likelihoods of population genetic parameters. Additionally, conditional allele trajectories and marginal likelihoods of samples from single populations or from multiple populations that split in the past can be obtained. The described approaches allow for efficient maximum likelihood inference of population genetic parameters in a wide variety of demographic scenarios.PostprintPeer reviewe

    Similar works