A comparison of single-sample estimators of effective population sizes from genetic marker data

Abstract

In molecular ecology and conservation genetics studies, the important parameter of effective population size (Ne) is increasingly estimated from a single sample of individuals taken at random from a population and genotyped at a number of marker loci. Several estimators have been developed, based on the information on linkage disequilibrium (LD), heterozygote excess (HE), molecular coancestry (MC), and sibship frequency (SF) in marker data. The most popular is the LD estimator, because it is more accurate than the HE and MC estimators and simpler to calculate than the SF estimator. However, little is known about the accuracy of the LD estimator relative to that of the SF estimator, or about the robustness of all single-sample estimators when simplifying assumptions (e.g. random mating, no linkage, no genotyping errors) are violated. This study fills these gaps, using extensive simulations to compare the biases and accuracies of the four estimators for different population properties (e.g. bottlenecks, non-random mating, haplodiploidy), marker properties (e.g. linkage, polymorphism) and sample properties (e.g. numbers of individuals and markers), and to compare their robustness when marker data are imperfect (with allelic dropouts). The simulations show that the SF estimator is more accurate, has a much wider application scope (e.g. suitable for non-random mating such as selfing, haplodiploid species, dominant markers) and is more robust (e.g. to the presence of linkage and genotyping errors of markers) than the other estimators. An empirical dataset from a Yellowstone grizzly bear population was analysed to demonstrate the use of the SF estimator in practice.

This paper was published in UCL Discovery.
