2,785 research outputs found

    Fundamental limits on the accuracy of demographic inference based on the sample frequency spectrum

    The sample frequency spectrum (SFS) of DNA sequences from a collection of individuals is a summary statistic which is commonly used for parametric inference in population genetics. Despite the popularity of SFS-based inference methods, currently little is known about the information-theoretic limit on the estimation accuracy as a function of sample size. Here, we show that using the SFS to estimate the size history of a population has a minimax error of at least $O(1/\log s)$, where $s$ is the number of independent segregating sites used in the analysis. This rate is exponentially worse than known convergence rates for many classical estimation problems in statistics. Another surprising aspect of our theoretical bound is that it does not depend on the dimension of the SFS, which is related to the number of sampled individuals. This means that, for a fixed number $s$ of segregating sites considered, using more individuals does not help to reduce the minimax error bound. Our result pertains to populations that have experienced a bottleneck, and we argue that it can be expected to apply to many populations in nature. Comment: 17 pages, 1 figure
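
    To give a feel for how slow a 1/log s rate is, the sketch below compares it with a 1/sqrt(s) rate. The 1/sqrt(s) rate is an assumption chosen here as a typical classical parametric rate; the abstract does not name a specific comparison. Constants are ignored and all function names are hypothetical, so the numbers are only indicative of scaling, not of actual error bounds.

```python
import math

def log_rate(s):
    """Rate quoted in the abstract, with the unknown constant set to 1: 1 / log(s)."""
    return 1.0 / math.log(s)

def sqrt_rate(s):
    """Assumed comparison rate (typical parametric rate, constant set to 1): 1 / sqrt(s)."""
    return 1.0 / math.sqrt(s)

def sites_for_log_rate(eps):
    """Number of segregating sites s at which 1 / log(s) equals eps."""
    return math.exp(1.0 / eps)

def sites_for_sqrt_rate(eps):
    """Number of segregating sites s at which 1 / sqrt(s) equals eps."""
    return 1.0 / eps ** 2

if __name__ == "__main__":
    # Halving the target error squares the required s under the log rate,
    # but only quadruples it under the square-root rate.
    for eps in (0.2, 0.1, 0.05):
        print(f"eps={eps}: log-rate s ~ {sites_for_log_rate(eps):.3g}, "
              f"sqrt-rate s ~ {sites_for_sqrt_rate(eps):.3g}")
```

    Under these assumptions, each halving of the target error squares the number of sites required at the logarithmic rate, which is the sense in which the rate is exponentially worse.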

    Multi-locus analysis of genomic time series data from experimental evolution.

    Genomic time series data generated by evolve-and-resequence (E&R) experiments offer a powerful window into the mechanisms that drive evolution. However, standard population genetic inference procedures do not account for sampling serially over time, and new methods are needed to make full use of modern experimental evolution data. To address this problem, we develop a Gaussian process approximation to the multi-locus Wright-Fisher process with selection over a time course of tens of generations. The mean and covariance structure of the Gaussian process are obtained by computing the corresponding moments in discrete-time Wright-Fisher models conditioned on the presence of a linked selected site. This enables our method to account for the effects of linkage and selection, both along the genome and across sampled time points, in an approximate but principled manner. We first use simulated data to demonstrate the power of our method to correctly detect, locate and estimate the fitness of a selected allele from among several linked sites. We study how this power changes for different values of selection strength, initial haplotypic diversity, population size, sampling frequency, experimental duration, number of replicates, and sequencing coverage depth. In addition to providing quantitative estimates of selection parameters from experimental evolution data, our model can be used by practitioners to design E&R experiments with requisite power. We also explore how our likelihood-based approach can be used to infer other model parameters, including effective population size and recombination rate. Then, we apply our method to analyze genome-wide data from a real E&R experiment designed to study the adaptation of D. melanogaster to a new laboratory environment with alternating cold and hot temperatures.

    The In-Hospital Mortality Rates of Slaves and Freemen: Evidence from Touro Infirmary, New Orleans, Louisiana, 1855–1860

    Using a rich sample of admission records from New Orleans' Touro Infirmary, we examine the in-hospital mortality risk of free and enslaved patients. Despite a higher mortality rate in the general population, slaves were significantly less likely to die in the hospital than whites. We analyze the determinants of in-hospital mortality at Touro using Oaxaca-type decomposition to aggregate our regression results. After controlling for differences in characteristics and maladies, we find that much of the mortality gap remains unexplained. In conclusion, we propose an alternative explanation for the mortality gap based on the selective hospital admission of slaves. Keywords: hospital, slavery, Oaxaca-type decomposition, New Orleans, Touro

    Inference of Population History using Coalescent HMMs: Review and Outlook

    Studying how diverse human populations are related is of historical and anthropological interest, in addition to providing a realistic null model for testing for signatures of natural selection or disease associations. Furthermore, understanding the demographic histories of other species is playing an increasingly important role in conservation genetics. A number of statistical methods have been developed to infer population demographic histories using whole-genome sequence data, with recent advances focusing on allowing for more flexible modeling choices, scaling to larger data sets, and increasing statistical power. Here we review coalescent hidden Markov models, a powerful class of population genetic inference methods that can effectively utilize linkage disequilibrium information. We highlight recent advances, give advice for practitioners, point out potential pitfalls, and present possible future research directions. Comment: 12 pages, 2 figures