The Ewens sampling formula (ESF) is a one-parameter family of probability
distributions with a number of intriguing combinatorial connections. This
elegant closed-form formula first arose in biology as the stationary
probability distribution of a sample configuration at one locus under the
infinite-alleles model of mutation. Since its discovery in the early 1970s, the
ESF has been used in various biological applications, and has sparked several
interesting mathematical generalizations. In the population genetics community,
extending the underlying random-mating model to include recombination has
received much attention in the past, but no general closed-form sampling
formula is currently known even for the simplest extension, that is, a model
with two loci. In this paper, we show that it is possible to obtain useful
closed-form results when the population-scaled recombination rate ρ
is large but not necessarily infinite. Specifically, we consider an asymptotic
expansion of the two-locus sampling formula in inverse powers of ρ and
obtain closed-form expressions for the first few terms in the expansion. Our
asymptotic sampling formula applies to arbitrary sample sizes and
configurations.

Comment: Published at http://dx.doi.org/10.1214/09-AAP646 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
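For readers unfamiliar with the one-locus ESF that the abstract starts from, the following is a minimal sketch of the classical formula in its standard textbook form. It is not taken from the paper: the function name `ewens_probability`, the argument `counts` (where `counts[j-1]` is the number of allele types observed exactly j times), and the symbol `theta` for the population-scaled mutation rate are all assumptions made for illustration. The two-locus asymptotic formula in inverse powers of ρ is not reproduced here.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(counts, theta):
    """Classical one-locus Ewens sampling formula (illustrative sketch).

    counts[j-1] is the number of allele types seen exactly j times in the
    sample; theta > 0 is the assumed population-scaled mutation rate.
    Returns the probability of this allelic partition as an exact Fraction.
    """
    theta = Fraction(theta)
    # Sample size implied by the partition: n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))

    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i

    # n! / rising * prod_j (theta / j)^{a_j} / a_j!
    prob = Fraction(factorial(n)) / rising
    for j, a in enumerate(counts, start=1):
        prob *= (theta / j) ** a / factorial(a)
    return prob


if __name__ == "__main__":
    # All allelic partitions of a sample of size 3, with theta = 2.
    partitions = [[3, 0, 0], [1, 1, 0], [0, 0, 1]]
    for p in partitions:
        print(p, ewens_probability(p, 2))
```

Exact rational arithmetic is used so that the probabilities over all partitions of a given sample size can be checked to sum to one without rounding error.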